Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention 

[0001] The present invention relates to novel polynucleotides derived from microorganisms belonging to coryneform 
bacteria and fragments thereof, polypeptides encoded by the polynucleotides and fragments thereof, polynucleotide 
arrays comprising the polynucleotides and fragments thereof, computer readable recording media in which the nucle- 
otide sequences of the polynucleotide and fragments thereof have been recorded, and use of them as well as a method 
of using the polynucleotide and/or polypeptide sequence information to make comparisons. 

2. Brief Description of the Background Art 

[0002] Coryneform bacteria are used in producing various useful substances, such as amino acids, nucleic acids, 
vitamins, saccharides (for example, ribulose), organic acids (for example, pyruvic acid), and analogues of the above- 
described substances (for example, N-acetylamino acids) and are very useful microorganisms industrially. Many mu- 
tants thereof are known. 

[0003] For example, Corynebacterium glutamicum is a Gram-positive bacterium identified as a glutamic acid-producing
bacterium, and many amino acids are produced by mutants thereof. For example, 1,000,000 ton/year of L-glutamic acid,
which is useful as a seasoning for umami (delicious taste), 250,000 ton/year of L-lysine, which is a valuable
additive for livestock feeds and the like, and several hundred ton/year or more of other amino acids, such as L-arginine,
L-proline, L-glutamine, L-tryptophan, and the like, have been produced in the world (Nikkei Bio Yearbook 99, published
by Nikkei BP (1998)).

[0004] The production of amino acids by Corynebacterium glutamicum is mainly carried out by its mutants (metabolic
mutants) which have a mutated metabolic pathway and regulatory systems. In general, an organism is provided with
various metabolic regulatory systems so as not to produce more amino acids than it needs. In the biosynthesis of L- 
lysine, for example, a microorganism belonging to the genus Corynebacterium is under such regulation as preventing 
the excessive production by concerted inhibition by lysine and threonine against the activity of a biosynthesis enzyme 
common to lysine, threonine and methionine, i.e., an aspartokinase, (J. Biochem., 65: 849-859 (1969)). The biosyn- 
thesis of arginine is controlled by repressing the expression of its biosynthesis gene by arginine so as not to biosyn- 
thesize an excessive amount of arginine (Microbiology 142: 99-108 (1996)). It is considered that these metabolic 
regulatory mechanisms are deregulated in amino acid-producing mutants. Similarly, the metabolic regulation is dereg- 
ulated in mutants producing nucleic acids, vitamins, saccharides, organic acids and analogues of the above-described 
substances so as to improve the productivity of the objective product. 

[0005] However, accumulation of basic genetic, biochemical and molecular biological data on coryneform bacteria 
is insufficient in comparison with Escherichia coli, Bacillus subtilis, and the like. Also, few findings have been obtained 
on mutated genes in amino acid-producing mutants. Thus, there are various mechanisms, which are still unknown, of 
regulating the growth and metabolism of these microorganisms. 

[0006] A chromosomal physical map of Corynebacterium glutamicum ATCC 13032 has been reported, and it is known that
its genome size is about 3,100 kb (Mol. Gen. Genet., 252: 255-265 (1996)). Calculating on the basis of the usual gene
density of bacteria, it is presumed that about 3,000 genes are present in this genome of about 3,100 kb. However, only
about 100 genes mainly concerning amino acid biosynthesis genes are known in Corynebacterium glutamicum, and 
the nucleotide sequences of most genes have not been clarified hitherto. 
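
The estimate above can be reproduced with simple arithmetic. The following is a minimal sketch that assumes an average of roughly one gene per kilobase, a common rule of thumb for bacterial genomes; that density and all names in the code are illustrative assumptions, not values stated in this description.

```python
# Rough gene-count estimate for a bacterial genome.
GENOME_SIZE_KB = 3100        # genome size reported in the text (about 3,100 kb)
KNOWN_GENES = 100            # genes reported as known in the text (about 100)
ASSUMED_KB_PER_GENE = 1.0    # ASSUMPTION: about 1 kb of genome per gene


def estimate_gene_count(genome_kb: float, kb_per_gene: float) -> int:
    """Return the expected number of genes for a genome of the given size."""
    return round(genome_kb / kb_per_gene)


estimated_genes = estimate_gene_count(GENOME_SIZE_KB, ASSUMED_KB_PER_GENE)
fraction_known = KNOWN_GENES / estimated_genes

print(f"estimated genes: {estimated_genes}")
print(f"fraction already known: {fraction_known:.1%}")
```

Under this assumption the result is on the order of 3,000 genes, so the roughly 100 known genes amount to only a few percent of the expected total.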

[0007] In recent years, the full nucleotide sequences of the genomes of several microorganisms, such as Escherichia
coli, Mycobacterium tuberculosis, yeast, and the like, have been determined (Science, 277: 1453-62 (1997); Nature,
393: 537-544 (1998); Nature, 387: 5-105 (1997)). Based on the thus determined full nucleotide sequences, assumption
of gene regions and prediction of their function by comparison with the nucleotide sequences of known genes have 
been carried out. Thus, the functions of a great number of genes have been presumed, without genetic, biochemical 
or molecular biological experiments. 

[0008] In recent years, moreover, techniques for monitoring expression levels of a great number of genes simulta- 
neously or detecting mutations, using DNA chips, DNA arrays or the like in which a partial nucleic acid fragment of a 
gene or a partial nucleic acid fragment in genomic DNA other than a gene is fixed to a solid support, have been 
developed. The techniques contribute to the analysis of microorganisms, such as yeasts, Mycobacterium tuberculosis, 
Mycobacterium bovis used in BCG vaccines, and the like (Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA, 
96: 12833-38 (1999); Science, 284: 1520-23 (1999)). 

SUMMARY OF THE INVENTION

[0009] An object of the present invention is to provide a polynucleotide and a polypeptide derived from a microor- 
ganism of coryneform bacteria which are industrially useful, sequence information of the polynucleotide and the 
polypeptide, a method for analyzing the microorganism, an apparatus and a system for use in the analysis, and a 
method for breeding the microorganism. 

[0010] The present invention provides a polynucleotide and an oligonucleotide derived from a microorganism be- 
longing to coryneform bacteria, oligonucleotide arrays to which the polynucleotides and the oligonucleotides are fixed, 
a polypeptide encoded by the polynucleotide, an antibody which recognizes the polypeptide, polypeptide arrays to 
which the polypeptides or the antibodies are fixed, a computer readable recording medium in which the nucleotide 
sequences of the polynucleotide and the oligonucleotide and the amino acid sequence of the polypeptide have been 
recorded, and a system based on the computer using the recording medium as well as a method of using the polynu- 
cleotide and/or polypeptide sequence information to make comparisons. 

BRIEF DESCRIPTION OF THE DRAWING 

[0011] Fig. 1 is a map showing the positions of typical genes on the genome of Corynebacterium glutamicum ATCC
13032. 

[0012] Fig. 2 shows electrophoresis results from proteome analyses using proteins derived from (A) Corynebacterium
glutamicum ATCC 13032, (B) FERM BP-7134, and (C) FERM BP-158.

[0013] Fig. 3 is a flow chart of an example of a system using the computer readable media according to the present
invention. 

[0014] Fig. 4 is a flow chart of an example of a system using the computer readable media according to the present
invention. 

DETAILED DESCRIPTION OF THE INVENTION

[0015] This application is based on Japanese applications No. Hei. 11-377484 filed on December 16, 1999, No. 
2000-159162 filed on April 7, 2000 and No. 2000-280988 filed on August 3, 2000, the entire contents of which are 
incorporated herein by reference.

[0016] From the viewpoint that the determination of the full nucleotide sequence of Corynebacterium glutamicum 
would make it possible to specify gene regions which had not been previously identified, to determine the function of 
an unknown gene derived from the microorganism through comparison with nucleotide sequences of known genes 
and amino acid sequences of known genes, and to obtain a useful mutant based on the presumption of the metabolic 
regulatory mechanism of a useful product by the microorganism, the inventors conducted intensive studies and, as a 
result, found that the complete genome sequence of Corynebacterium glutamicum can be determined by applying the 
whole genome shotgun method. 

[0017] Specifically, the present invention relates to the following (1) to (65): 
(1) A method for at least one of the following: 

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium, 

(B) measuring an expression amount of a gene derived from a coryneform bacterium, 

(C) analyzing an expression profile of a gene derived from a coryneform bacterium, 

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or 

(E) identifying a gene homologous to a gene derived from a coryneform bacterium, 

said method comprising: 

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected 
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any 
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of 
the first or second polynucleotides, 

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a co- 
ryneform bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a 
labeled polynucleotide to be examined, under hybridization conditions, 

(c) detecting any hybridization, and 

(d) analyzing the result of the hybridization. 

As used herein, for example, the at least two polynucleotides can be at least two of the first polynucleotides,
at least two of the second polynucleotides, at least two of the third polynucleotides, or at least
two of the first, second and third polynucleotides.
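
Step (d) of the method leaves the analysis open. One common way to compare hybridization signals from a parent strain and a mutant is a per-probe log2 ratio, sketched below. The probe names, signal values, signal floor and two-fold threshold are all hypothetical assumptions chosen for illustration; none of them comes from this description.

```python
import math


def log2_ratios(reference, sample, floor=1.0):
    """Return {probe: log2(sample / reference)} for probes present in both sets.

    Signals below `floor` are raised to `floor` to avoid dividing by zero.
    """
    ratios = {}
    for probe, ref_signal in reference.items():
        if probe not in sample:
            continue
        ratios[probe] = math.log2(max(sample[probe], floor) / max(ref_signal, floor))
    return ratios


def changed_probes(ratios, threshold=1.0):
    """Return probes whose signal changed at least 2**threshold-fold, sorted by name."""
    return sorted(p for p, r in ratios.items() if abs(r) >= threshold)


# Hypothetical spot intensities read from one array per labeled sample.
parent_signal = {"probe_A": 200.0, "probe_B": 150.0, "probe_C": 800.0}
mutant_signal = {"probe_A": 820.0, "probe_B": 160.0, "probe_C": 190.0}

ratios = log2_ratios(parent_signal, mutant_signal)
flagged = changed_probes(ratios)
print(flagged)
```

With these made-up values, probe_A and probe_C are flagged as differing between the two samples, while probe_B is treated as unchanged.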

(2) The method according to (1), wherein the coryneform bacterium is a microorganism belonging to the genus 
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(3) The method according to (2), wherein the microorganism belonging to the genus Corynebacterium is selected 
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium 
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(4) The method according to (1), wherein the polynucleotide derived from a coryneform bacterium, the polynucleotide
derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, 
an organic acid, and analogues thereof. 

(5) The method according to (1), wherein the polynucleotide to be examined is derived from Escherichia coli.

(6) A polynucleotide array, comprising: 

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the nucle- 
otide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize 
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200 con- 
tinuous bases of the first or second polynucleotides, and 
a solid support adhered thereto. 

As used herein, for example, the at least two polynucleotides can be at least two of the first polynucleotides, 
at least two of the second polynucleotides, at least two of the third polynucleotides, or at least two of the first,
second and third polynucleotides.

(7) A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having 
a homology of at least 80% with the polynucleotide. 

(8) A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or 
a polynucleotide which hybridizes with the polynucleotide under stringent conditions. 

(9) A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID 
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions. 

(10) A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the 
nucleotide sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide 
sequence represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide. 

(11) A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of 
any one of (7) to (10), or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide 
comprising 10 to 200 continuous bases.

(12) A recombinant DNA comprising the polynucleotide of any one of (8) to (11). 

(13) A transformant comprising the polynucleotide of any one of (8) to (11) or the recombinant DNA of (12). 

(14) A method for producing a polypeptide, comprising: 

culturing the transformant of (13) in a medium to produce and accumulate a polypeptide encoded by the 
polynucleotide of (8) or (9) in the medium, and 
recovering the polypeptide from the medium. 

(15) A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof, comprising: 

culturing the transformant of (13) in a medium to produce and accumulate at least one of an amino acid, a 
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and 
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid, 
and analogues thereof from the medium. 

(16) A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2
to 3431.

(17) A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931. 

(18) The polypeptide according to (16) or (17), wherein at least one amino acid is deleted, replaced, inserted or 
added, said polypeptide having an activity which is substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition. 

(19) A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid 
sequence of the polypeptide of (16) or (17), and having an activity which is substantially the same as that of the 
polypeptide. 

(20) An antibody which recognizes the polypeptide of any one of (16) to (19). 

(21) A polypeptide array, comprising: 

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of (16) to (19) and 
partial fragment polypeptides of the polypeptides, and 
a solid support adhered thereto. 

(22) A polypeptide array, comprising: 

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the polypep- 
tides of (16) to (19) and partial fragment polypeptides of the polypeptides, and 
a solid support adhered thereto. 

(23) A system based on a computer for identifying a target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
1 to 3501 with the target sequence or target structure motif information, recorded by the data storage device 
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the 
target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

(24) A method based on a computer for identifying a target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target se- 
quence information or target structure motif information into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with 
the target sequence or target structure motif information; and 

(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the 
target sequence or target structure motif information. 
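
The input, storage, comparison and screening steps of (23) and (24) can be sketched in a few lines. This is an illustrative sketch only: the stored sequences are short made-up strings, not entries of the sequence listing, and the use of `difflib.SequenceMatcher` with a 0.8 similarity cut-off to decide what counts as "analogous" is an assumption, since the text does not fix a comparison algorithm.

```python
from difflib import SequenceMatcher


def screen(stored, target, min_similarity=0.8):
    """Return (name, kind, score) for stored sequences matching the target.

    kind is "coincident" for an exact substring match and "analogous" when the
    best same-length window reaches min_similarity.
    """
    hits = []
    for name, seq in stored.items():
        if target in seq:
            hits.append((name, "coincident", 1.0))
            continue
        best = 0.0
        # Slide a window of the target's length along the stored sequence.
        for start in range(max(1, len(seq) - len(target) + 1)):
            window = seq[start:start + len(target)]
            score = SequenceMatcher(None, window, target, autojunk=False).ratio()
            best = max(best, score)
        if best >= min_similarity:
            hits.append((name, "analogous", round(best, 3)))
    return hits


# (i)-(ii) input and temporary storage: hypothetical sequences held in memory.
stored_sequences = {
    "seq_1": "ATGGCTAAGGTTCCGA",
    "seq_2": "ATGGCTTAGGTTCCGA",
    "seq_3": "TTTTTTTTTTTTTTTT",
}
target_sequence = "GCTAAGGTTC"

# (iii)-(iv) comparison and screening.
hits = screen(stored_sequences, target_sequence)
for hit in hits:
    print(hit)
```

In practice a dedicated homology search tool would replace the sliding-window comparison; the sketch only shows how the four steps relate to one another.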

(25) A system based on a computer for identifying a target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage 
device for screening and analyzing amino acid sequence information which is coincident with or analogous to 
the target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

(26) A method based on a computer for identifying a target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device; 

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target sequence or target structure motif information; and 

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the 
target sequence or target structure motif information. 

(27) A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having 
a target nucleotide sequence derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2 
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide 
sequence information; 

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS: 
2 to 3501 with the target nucleotide sequence information, and determining a function of a polypeptide encoded 
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the poly- 
nucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and 

(iv) an output device that shows a function obtained by the comparator.

(28) A method based on a computer for determining a function of a polypeptide encoded
by a polynucleotide having a target nucleotide sequence derived from a coryneform bacterium, comprising the
following: 

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function information
of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with 
the target nucleotide sequence information; and 

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence 
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected 
from SEQ ID NOS:2 to 3501. 
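
The function-determination steps of (27) and (28) amount to transferring the annotation of the closest stored sequence to the target. The sketch below illustrates that idea under stated assumptions: the sequences and function labels are invented placeholders, and the whole-sequence similarity score from `difflib.SequenceMatcher` with a 0.8 cut-off stands in for whatever comparison a real system would use.

```python
from difflib import SequenceMatcher


def predict_function(annotated, target, min_similarity=0.8):
    """Return (function, name, score) of the most similar annotated sequence.

    Returns None when no stored sequence reaches min_similarity.
    """
    best_name, best_score = None, 0.0
    for name, (seq, _function) in annotated.items():
        score = SequenceMatcher(None, seq, target, autojunk=False).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < min_similarity:
        return None
    return annotated[best_name][1], best_name, best_score


# Hypothetical stored records: name -> (nucleotide sequence, function label).
annotated_orfs = {
    "orf_x": ("ATGGCTAAGGTTCCGA", "hypothetical kinase"),
    "orf_y": ("ATGTTTCCCGGGAAAT", "hypothetical transporter"),
}

result = predict_function(annotated_orfs, "ATGGCTTAGGTTCCGA")
print(result)
```

The same structure applies to the amino acid based variants in (29) and (30); only the alphabet of the stored strings changes.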

(29) A system based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide 
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least 
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and 

(iv) an output device that shows a function obtained by the comparator. 

(30) A method based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function 
information based on the amino acid sequence, and target amino acid sequence information; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target amino acid sequence information; and 

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or 
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to
7001. 



(31) The system according to any one of (23), (25), (27) and (29), wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(32) The method according to any one of (24), (26), (28) and (30), wherein a coryneform bacterium is a microor- 
ganism of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

(33) The system according to (31), wherein the microorganism belonging to the genus Corynebacterium is selected 
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium 
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(34) The method according to (32), wherein the microorganism belonging to the genus Corynebacterium is selected 
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium 
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(35) A recording medium or storage device which is readable by a computer in which at least one nucleotide 
sequence information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide se- 
quence is recorded, and is usable in the system of (23) or (27) or the method of (24) or (28). 

(36) A recording medium or storage device which is readable by a computer in which at least one amino acid 
sequence information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid 
sequence is recorded, and is usable in the system of (25) or (29) or the method of (26) or (30). 

(37) The recording medium or storage device according to (35) or (36), which is a computer readable recording
medium selected from the group consisting of a floppy disc,
a hard disc, a magnetic tape, a random access memory (RAM), a read only memory (ROM), a magneto-optic disc 
(MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM and DVD-RW. 

(38) A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the
Val residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Val residue. 

(39) A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino
acid sequence represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue.

(40) The polypeptide according to (38) or (39), wherein the Val residue at the 59th position is replaced with an Ala 
residue. 
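
The substitution described in (38) to (40) can be expressed as a checked edit on a one-letter amino acid string. The sketch below uses a made-up placeholder sequence with a Val at position 59; it is not the homoserine dehydrogenase sequence of the sequence listing, and the function name is hypothetical.

```python
def substitute_residue(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position after checking the original residue."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Placeholder sequence: Met, 57 Gly, then Val at position 59, then Lys-Leu.
toy_sequence = "M" + "G" * 57 + "V" + "KL"

# Val59 -> Ala, written V59A in the usual shorthand.
variant = substitute_residue(toy_sequence, 59, "V", "A")
print(toy_sequence[56:61], "->", variant[56:61])
```

Checking the expected residue before editing guards against off-by-one errors when positions are counted from 1, as they are in the text.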

(41) A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro 
residue at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform 
bacterium is replaced with an amino acid residue other than a Pro residue. 

(42) A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino 
acid sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue. 

(43) The polypeptide according to (41) or (42), wherein the Pro residue at the 458th position is replaced with a Ser
residue. 

(44) The polypeptide according to any one of (38) to (43), which is derived from Corynebacterium glutamicum. 

(45) A DNA encoding the polypeptide of any one of (38) to (44). 

(46) A recombinant DNA comprising the DNA of (45). 

(47) A transformant comprising the recombinant DNA of (46). 

(48) A transformant comprising in its chromosome the DNA of (45). 

(49) The transformant according to (47) or (48), which is derived from a coryneform bacterium. 

(50) The transformant according to (49), which is derived from Corynebacterium glutamicum. 

(51) A method for producing L-lysine, comprising: 

culturing the transformant of any one of (47) to (50) in a medium to produce and accumulate L-lysine in the 
medium, and 

recovering the L-lysine from the culture. 

(52) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by 
SEQ ID NOS:1 to 3431, comprising the following: 

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacterium
which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i); 

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point; and 

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform 
bacterium obtained in (iii).
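
Steps (i) and (ii) of (52) compare a production-strain sequence with the corresponding reference sequence and list the positions that differ. A minimal sketch follows, assuming the two sequences are already aligned and of equal length, so that only substitutions are reported; the short sequences are invented for illustration.

```python
def find_point_mutations(reference, production):
    """Return (1-based position, reference base, production base) for each mismatch.

    ASSUMPTION: both sequences are pre-aligned and equally long, so insertions
    and deletions are out of scope for this sketch.
    """
    if len(reference) != len(production):
        raise ValueError("sequences must be aligned to the same length first")
    return [
        (i + 1, ref_base, prod_base)
        for i, (ref_base, prod_base) in enumerate(zip(reference, production))
        if ref_base != prod_base
    ]


# Hypothetical gene fragment; the second codon is GTT (Val) in the reference
# and GCT (Ala) in the production strain.
reference_gene = "ATGGTTAAACGT"
production_gene = "ATGGCTAAACGT"

mutations = find_point_mutations(reference_gene, production_gene)
print(mutations)
```

Each reported position is a candidate mutation point that steps (iii) and (iv) would then introduce into a clean background and test for its effect on productivity.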

(53) The method according to (52), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or 
a signal transmission pathway. 

(54) The method according to (52), wherein the mutation point is a mutation point relating to a useful mutation 
which improves or stabilizes the productivity. 

(55) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising: 

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacterium
which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and 

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform 
bacterium obtained in (iii). 

(56) The method according to (55), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or 
a signal transmission pathway. 

(57) The method according to (55), wherein the mutation point is a mutation point which decreases or destabilizes 
the productivity. 

(58) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:2 to 3431, comprising the following: 

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide sequence
information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity; 

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and 

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform bacterium
which has been transformed with the gene obtained in (iii).

(59) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by 
SEQ ID NOS:2 to 3431, comprising the following: 

(i) arranging function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission 
pathway; 

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a
coryneform bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and 

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either 
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or 
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv). 

(60) A coryneform bacterium, bred by the method of any one of (52) to (59). 

(61) The coryneform bacterium according to (60), which is a microorganism belonging to the genus Corynebac- 
terium, the genus Brevibacterium, or the genus Microbacterium. 

(62) The coryneform bacterium according to (61), wherein the microorganism belonging to the genus Corynebac- 
terium is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, 
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(63) A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a 
saccharide, an organic acid and an analogue thereof, comprising: 

culturing a coryneform bacterium of any one of (60) to (62) in a medium to produce and accumulate at least 
one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof; 

recovering the compound from the culture. 

(64) The method according to (63), wherein the compound is L-lysine. 

(65) A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the 
following: 

(i) preparing 

a protein derived from a bacterium of a production strain of a coryneform bacterium which has been sub- 
jected to mutation breeding by a fermentation process so as to produce at least one compound selected 
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and 
a protein derived from a bacterium of a parent strain of the production strain; 

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis; 

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the 
production strain with that derived from the parent strain; 

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase 
to extract peptide fragments; 

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and 

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequence represented by SEQ
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.

As used herein, the term "proteome", which is a word coined by combining "protein" with "genome", refers to
a method for examining a gene at the polypeptide level.

(66) The method according to (65), wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(67) The method according to (66), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(68) A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).

[0018] The present invention will be described below in more detail, based on the determination of the full nucleotide 
sequence of coryneform bacteria. 

1. Determination of full nucleotide sequence of coryneform bacteria 

[0019] The term "coryneform bacteria" as used herein means a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium or the genus Microbacterium as defined in Bergey's Manual of Determinative
Bacteriology 8: 599 (1974).

[0020] Examples include Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium
callunae, Corynebacterium glutamicum, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, Brevibacterium saccharolyticum, Brevibacterium immariophilum,
Brevibacterium roseum, Brevibacterium thiogenitalis, Microbacterium ammoniaphilum, and the like.

[0021] Specific examples include Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium acetoglutamicum
ATCC 15806, Corynebacterium callunae ATCC 15991, Corynebacterium glutamicum ATCC 13032, Corynebacterium
glutamicum ATCC 13060, Corynebacterium glutamicum ATCC 13826 (prior genus and species: Brevibacterium
flavum, or Corynebacterium lactofermentum), Corynebacterium glutamicum ATCC 14020 (prior genus and species:
Brevibacterium divaricatum), Corynebacterium glutamicum ATCC 13869 (prior genus and species: Brevibacterium
lactofermentum), Corynebacterium herculis ATCC 13868, Corynebacterium lilium ATCC 15990, Corynebacterium
melassecola ATCC 17965, Corynebacterium thermoaminogenes FERM 9244, Brevibacterium saccharolyticum ATCC
14066, Brevibacterium immariophilum ATCC 14068, Brevibacterium roseum ATCC 13825, Brevibacterium thiogenitalis
ATCC 19240, Microbacterium ammoniaphilum ATCC 15354, and the like.
(1) Preparation of genome DNA of coryneform bacteria

[0022] Coryneform bacteria can be cultured by a conventional method. 

[0023] Any of a natural medium and a synthetic medium can be used, so long as it is a medium suitable for efficient 
culturing of the microorganism, and it contains a carbon source, a nitrogen source, an inorganic salt, and the like which 
can be assimilated by the microorganism. 

[0024] In Corynebacterium glutamicum, for example, a BY medium (7 g/l meat extract, 10 g/l peptone, 3 g/l sodium 
chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine and the like can be used. The culturing is carried out at 
25 to 35°C overnight. 

[0025] After the completion of the culture, the cells are recovered from the culture by centrifugation. The resulting 
cells are washed with a washing solution. 

[0026] Examples of the washing solution include STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l
ethylenediaminetetraacetic acid (hereinafter referred to as "EDTA"), pH 8.0), and the like.

[0027] Genome DNA can be obtained from the washed cells according to a conventional method for obtaining ge- 
nome DNA, namely, lysing the cell wall of the cells using a lysozyme and a surfactant (SDS, etc.), eliminating proteins 
and the like using a phenol solution and a phenol/chloroform solution, and then precipitating the genome DNA with 
ethanol or the like. Specifically, the following method can be illustrated. 

[0028] The washed cells are suspended in a washing solution containing 5 to 20 mg/l lysozyme. After shaking, 5 to
20% SDS is added to lyse the cells. Usually, the shaking is performed gently at 25 to 40°C for 30 minutes to 2 hours. After
shaking, the suspension is maintained at 60 to 70°C for 5 to 15 minutes for the lysis.

[0029] After the lysis, the suspension is cooled to ordinary temperature, and 5 to 20 ml of Tris-neutralized phenol is
added thereto, followed by gentle shaking at room temperature for 15 to 45 minutes.

[0030] After shaking, centrifugation (15,000 x g, 20 minutes, 20°C) is carried out to fractionate the aqueous layer.

[0031] After performing extraction with phenol/chloroform and extraction with chloroform (twice) in the same manner,
3 mol/l sodium acetate solution (pH 5.2) and isopropanol are added to the aqueous layer at 1/10 times volume and 2
times volume of the aqueous layer, respectively, followed by gentle stirring to precipitate the genome DNA.
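
The volume ratios stated in [0031] can be sketched as a small calculation. The following Python snippet is only an illustration; the function name and the dictionary keys are hypothetical and do not appear in the described method.

```python
def precipitation_volumes(aqueous_ml):
    """Volumes to add to an aqueous layer, using the ratios given in [0031].

    aqueous_ml: volume of the recovered aqueous layer in millilitres.
    The function name and keys are illustrative assumptions.
    """
    if aqueous_ml <= 0:
        raise ValueError("aqueous layer volume must be positive")
    sodium_acetate_ml = aqueous_ml / 10  # 1/10 volume of 3 mol/l sodium acetate (pH 5.2)
    isopropanol_ml = aqueous_ml * 2      # 2 volumes of isopropanol
    return {
        "sodium_acetate_ml": sodium_acetate_ml,
        "isopropanol_ml": isopropanol_ml,
        "total_ml": aqueous_ml + sodium_acetate_ml + isopropanol_ml,
    }


volumes = precipitation_volumes(5.0)
print(volumes)
```

For a 5 ml aqueous layer, this corresponds to 0.5 ml of sodium acetate solution and 10 ml of isopropanol.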
[0032] The genome DNA is dissolved again in a buffer containing 0.01 to 0.04 mg/ml RNase. As an example of the
buffer, TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) can be used. After dissolving, the resultant
solution is maintained at 25 to 40°C for 20 to 50 minutes and then extracted successively with phenol, phenol/chloroform
and chloroform as in the above case.

[0033] After the extraction, isopropanol precipitation is carried out and the resulting DNA precipitate is washed with 
70% ethanol, followed by air drying, and then dissolved in TE buffer to obtain a genome DNA solution. 

(2) Production of shotgun library 

[0034] Methods for producing a genome DNA library using the genome DNA of the coryneform bacteria prepared in
the above (1) include the method described in Molecular Cloning, A Laboratory Manual, Second Edition (1989) (hereinafter
referred to as "Molecular Cloning, 2nd ed."). In particular, the following method can be exemplified to prepare a genome
DNA library appropriately usable in determining the full nucleotide sequence by the shotgun method.

[0035] To 0.01 mg of the genome DNA of the coryneform bacteria prepared in the above (1), a buffer, such as TE
buffer or the like, is added to give a total volume of 0.4 ml. Then, the genome DNA is sheared into fragments of 1 to
10 kb with a sonicator (Yamato Powersonic Model 50). The treatment with the sonicator is performed at an output of
20 continuously for 5 seconds.

[0036] The resulting genome DNA fragments are blunt-ended using DNA blunting kit (manufactured by Takara Shuzo) 
or the like. 

[0037] The blunt-ended genome fragments are fractionated by agarose gel or polyacrylamide gel electrophoresis 
and genome fragments of 1 to 2 kb are cut out from the gel. 

[0038] To the gel, 0.2 to 0.5 ml of a buffer for eluting DNA, such as MG elution buffer (0.5 mol/l ammonium acetate, 
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) or the like, is added, followed by shaking at 25 to 40°C 
overnight to elute DNA. 

[0039] The resulting DNA eluate is treated with phenol/chloroform and then precipitated with ethanol to obtain a 
genome library insert. 

[0040] This insert is ligated into a suitable vector, such as pUC18 SmaI/SAP (manufactured by Amersham Pharmacia
Biotech) or the like, using T4 ligase (manufactured by Takara Shuzo) or the like. The ligation can be carried out by 
allowing a mixture to stand at 10 to 20°C for 20 to 50 hours. 

[0041] The resulting ligation product is precipitated with ethanol and dissolved in 5 to 20 µl of TE buffer.

[0042] Escherichia coli is transformed in accordance with a conventional method using 0.5 to 2 µl of the ligation
solution. Examples of the transformation method include the electroporation method using ELECTRO MAX DH10B
(manufactured by Life Technologies) for Escherichia coli. The electroporation method can be carried out under the
conditions as described in the manufacturer's instructions.

[0043] The transformed Escherichia coli is spread on a suitable selection medium containing agar, for example, LB 
plate medium containing 10 to 100 mg/l ampicillin (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium 
chloride, pH 7.0) containing 1.6% of agar) when pUC18 is used as the cloning vector, and cultured therein. 
[0044] The transformant can be obtained as colonies formed on the plate medium. In this step, it is possible to select 
the transformant having the recombinant DNA containing the genome DNA as white colonies by adding X-gal and 
IPTG (isopropyl-β-thiogalactopyranoside) to the plate medium.

[0045] The transformant is allowed to stand for culturing in a 96-well titer plate to which 0.05 ml of the LB medium
containing 0.1 mg/ml of ampicillin has been added in each well. The resulting culture can be used in an experiment of 
(4) described below. Also, the culture solution can be stored at -80°C by adding 0.05 ml per well of the LB medium 
containing 20% glycerol to the culture solution, followed by mixing, and the stored culture solution can be used at any 
time. 

(3) Production of cosmid library 

[0046] The genome DNA (0.1 mg) of the coryneform bacteria prepared in the above (1) is partially digested with a 
restriction enzyme, such as Sau3AI or the like, and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to 
40% sucrose density gradient using a 10% sucrose buffer (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA,
10% sucrose, pH 8.0) and a 40% sucrose buffer (elevating the concentration of the 10% sucrose buffer to 40%). 
[0047] After the centrifugation, the thus separated solution is fractionated into tubes at 1 ml per tube. After
confirming the size of the DNA fragments in each fraction, the fraction containing DNA fragments of
about 40 kb is precipitated with ethanol.

[0048] The resulting DNA fragment is ligated to a cosmid vector having a cohesive end which can be ligated to the
fragment. When the genome DNA is partially digested with Sau3AI, the partially digested product can be ligated to,
for example, the BamHI site of superCos1 (manufactured by Stratagene) in accordance with the manufacturer's
instructions.

[0049] The resulting ligation product is packaged using a packaging extract which can be prepared by a method
described in Molecular Cloning, 2nd ed. and then used in transforming Escherichia coli. More specifically, the ligation
product is packaged using, for example, a commercially available packaging extract, Gigapack III Gold Packaging
Extract (manufactured by Stratagene) in accordance with the manufacturer's instructions and then introduced into
Escherichia coli XL-1-BlueMR (manufactured by Stratagene) or the like.

[0050] The thus transformed Escherichia coli is spread on an LB plate medium containing ampicillin, and cultured 
therein. 

[0051] The transformant can be obtained as colonies formed on the plate medium. 

[0052] The transformant is subjected to standing culture in a 96-well titer plate to which 0.05 ml of the LB medium 
containing 0.1 mg/ml ampicillin has been added. 

[0053] The resulting culture can be employed in an experiment of (4) described below. Also, the culture solution can 
be stored at -80°C by adding 0.05 ml per well of the LB medium containing 20% glycerol to the culture solution, followed 
by mixing, and the stored culture solution can be used at any time. 

(4) Determination of nucleotide sequence 
(4-1) Preparation of template 

[0054] The full nucleotide sequence of genome DNA of coryneform bacteria can be determined basically according 
to the whole genome shotgun method (Science, 269: 496-512 (1995)). 

[0055] The template used in the whole genome shotgun method can be prepared by PCR using the library prepared
in the above (2) (DNA Research, 5: 1-9 (1998)).

[0056] Specifically, the template can be prepared as follows. 

[0057] The clone derived from the whole genome shotgun library is inoculated by using a replicator (manufactured 
by GENETIX) into each well of a 96-well plate to which 0.08 ml per well of the LB medium containing 0.1 mg/ml ampicillin
has been added, followed by stationary culturing at 37°C overnight. 

[0058] Next, the culture solution is transferred, using a copy plate (manufactured by Tokken), into each well of a
96-well reaction plate (manufactured by PE Biosystems) to which 0.025 ml per well of a PCR reaction solution has
been added using TaKaRa Ex Taq (manufactured by Takara Shuzo). Then, PCR is carried out in accordance with the
protocol by Makino et al. (DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE
Biosystems) to amplify the inserted fragments.
[0059] The excessive primers and nucleotides are eliminated using a kit for purifying a PCR product, and the product
is used as the template in the sequencing reaction. 

[0060] It is also possible to determine the nucleotide sequence using a double-stranded DNA plasmid as a template. 
[0061] The double-stranded DNA plasmid used as the template can be obtained by the following method. 
[0062] The clone derived from the whole genome shotgun library is inoculated into each well of a 24- or 96-well plate 
to which 1.5 ml per well of a 2 x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) 
containing 0.05 mg/ml ampicillin has been added, followed by culturing under shaking at 37°C overnight. 
[0063] The double-stranded DNA plasmid can be prepared from the culture solution using an automatic plasmid 
preparing machine KURABO PI-50 (manufactured by Kurabo Industries), a multiscreen (manufactured by Millipore) 
or the like, according to each protocol. 

[0064] To purify the plasmid, Biomek 2000 manufactured by Beckman Coulter and the like can be used. 

[0065] The resulting purified double-stranded DNA plasmid is dissolved in water to give a concentration of about 0.1
mg/ml. Then, it can be used as the template in sequencing.

(4-2) Sequencing reaction 

[0066] The sequencing reaction can be carried out according to a commercially available sequence kit or the like. A 
specific method is exemplified below. 

[0067] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured
by PE Biosystems), 1 to 2 pmol of an M13 regular direction primer (M13-21) or an M13 reverse direction primer
(M13REV) (DNA Research, 5: 1-9 (1998)) and 50 to 200 ng of the template prepared in the above (4-1) (the PCR
product or plasmid) are added to give 10 µl of a sequencing reaction solution.

[0068] A dye terminator sequencing reaction (35 to 55 cycles) is carried out using this reaction solution and GeneAmp
PCR System 9700 (manufactured by PE Biosystems) or the like. The cycle parameters can be determined in accordance
with a commercially available kit, for example, the manufacturer's instructions attached to the ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit.

[0069] The sample can be purified using a commercially available product, such as Multi Screen HV plate
(manufactured by Millipore) or the like, according to the manufacturer's instructions.

[0070] The thus purified reaction product is precipitated with ethanol, dried and then used for the analysis. The dried 
reaction product can be stored in the dark at -30°C and the stored reaction product can be used at any time. 
[0071] The dried reaction product can be analyzed using a commercially available sequencer and an analyzer
according to the manufacturer's instructions.

[0072] Examples of the commercially available sequencer include ABI PRISM 377 DNA Sequencer (manufactured
by PE Biosystems). Examples of the analyzer include ABI PRISM 3700 DNA Analyzer (manufactured by PE Biosystems).

(5) Assembly 

[0073] Software such as phred (The University of Washington) or the like can be used as the base caller for
analyzing the sequence information obtained in the above (4). Software such as Cross_Match (The University of
Washington), SPS Cross_Match (manufactured by Southwest Parallel Software) or the like can be used to mask
the vector sequence information.

[0074] For the assembly, software such as phrap (The University of Washington), SPS phrap (manufactured by
Southwest Parallel Software) or the like can be used.

[0075] For the above analysis and the output of its results, a computer such as a UNIX machine, a PC, a Macintosh,
or the like can be used.

[0076] Contigs obtained by the assembly can be analyzed using a graphical editor such as consed (The University
of Washington) or the like.

[0077] It is also possible to perform the series of operations from the base call to the assembly in a single run using
the script phredPhrap attached to consed.

[0078] As used herein, software will be understood to also be referred to as a comparator. 

(6) Determination of nucleotide sequence in gap part 

[0079] Each of the cosmids in the cosmid library constructed in the above (3) is prepared in the same manner as in 
the preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end 
of the insert fragment of the cosmid is determined using a commercially available kit, such as ABI PRISM BigDye 
Terminator Cycle Sequencing Ready Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's
instructions.






[0080] About 800 cosmid clones are sequenced at both ends of the inserted fragment to detect a nucleotide sequence
in the contigs derived from the shotgun sequencing obtained in (5) which coincides with the end sequence. Thus, the
linkage between the respective cosmid clones and the respective contigs is clarified, and mutual alignment is carried
out. Furthermore, the results are compared with known physical maps to map the cosmids and the contigs. In the case of
Corynebacterium glutamicum ATCC 13032, the physical map of Mol. Gen. Genet., 252: 255-265 (1996) can be used.
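
The linking of cosmid end sequences to contigs described in [0080] can be illustrated with a minimal Python sketch. It assumes exact substring matching on either strand, which is a simplification; real end reads contain sequencing errors and would normally be placed with an alignment tool. All names and the toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def locate_end_read(read, contigs):
    """Return (contig_name, position, strand) of the first exact hit, or None."""
    read = read.upper()
    read_rc = reverse_complement(read)
    for name, contig in contigs.items():
        contig = contig.upper()
        pos = contig.find(read)
        if pos != -1:
            return (name, pos, "+")
        pos = contig.find(read_rc)
        if pos != -1:
            return (name, pos, "-")
    return None


def link_cosmids(cosmid_ends, contigs):
    """Map each cosmid to the contig hits of its two end reads."""
    return {
        cosmid: (locate_end_read(left, contigs), locate_end_read(right, contigs))
        for cosmid, (left, right) in cosmid_ends.items()
    }


# Toy data, invented for illustration only.
contigs = {"contig1": "ATGCGTACGTTAGC", "contig2": "GGCCTTAAGGCCAT"}
cosmid_ends = {"cosmidA": ("CGTACG", "ATGGCCT")}
print(link_cosmids(cosmid_ends, contigs))
```

A cosmid whose two end reads fall in different contigs, as in this toy case, indicates that the two contigs are neighbours separated by at most one cosmid insert length.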
[0081] The sequence in the region which cannot be covered with the contigs (gap part) can be determined by the 
following method. 

[0082] Clones containing sequences positioned at the ends of the contigs are selected. Among these, a clone wherein
only one end of the inserted fragment has been determined is selected and the sequence at the opposite end of the 
inserted fragment is determined. 

[0083] A shotgun library clone or a cosmid clone derived therefrom containing the sequences at the respective ends 
of the inserted fragments in the two contigs is identified and the full nucleotide sequence of the inserted fragment of 
the clone is determined. 

[0084] According to this method, the nucleotide sequence of the gap part can be determined. 
[0085] When no shotgun library clone or cosmid clone covering the gap part is available, primers complementary to 
the end sequences of the two different contigs are prepared and the DNA fragment in the gap part is amplified. Then, 
sequencing is performed by the primer walking method using the amplified DNA fragment as a template or by the 
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment is determined. 
Thus, the nucleotide sequence of the above-described region can be determined. 
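
As a hedged sketch of the primer preparation in [0085], the Python snippet below derives one outward-facing primer from each of two contigs that flank a gap. It assumes that both contigs are given on the same strand with the gap lying after the first and before the second; the primer length of 20 and all names are illustrative assumptions, and no melting temperature or specificity check is attempted.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gap_primers(left_contig, right_contig, length=20):
    """Return (forward, reverse) primers, written 5' to 3', that point into the gap.

    forward: the last `length` bases of the left contig (extends into the gap).
    reverse: reverse complement of the first `length` bases of the right contig
             (extends back into the gap from the other side).
    """
    if length > min(len(left_contig), len(right_contig)):
        raise ValueError("primer length exceeds contig length")
    forward = left_contig[-length:].upper()
    reverse = reverse_complement(right_contig[:length])
    return forward, reverse


# Toy contigs, invented for illustration only.
fwd, rev = gap_primers("AAACCCGGGTTTACGTACGTAGCT", "TTGACCATGGCATGCAAGCTTGGC", length=6)
print(fwd, rev)
```

The amplified fragment between the two primers can then serve as the template for primer walking or for a small shotgun library, as the paragraph describes.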

[0086] In a region showing a low sequence accuracy, primers are synthesized using AUTOFINISH function and 
NAVIGATING function of consed (The University of Washington), and the sequence is determined by the primer walking 
method to improve the sequence accuracy. 

[0087] Examples of the thus determined nucleotide sequence of the full genome include the full nucleotide sequence
of genome of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1.

(7) Determination of nucleotide sequence of microorganism genome DNA using the nucleotide sequence represented
by SEQ ID NO:1

[0088] A nucleotide sequence of a polynucleotide having a homology of 80% or more with the full nucleotide sequence 
of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1 as determined above can also be deter- 
mined using the nucleotide sequence represented by SEQ ID NO:1, and the polynucleotide having a nucleotide se- 
quence having a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present 
invention is within the scope of the present invention. The term "polynucleotide having a nucleotide sequence having 
a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present invention" is a 
polynucleotide in which a full nucleotide sequence of the chromosome DNA can be determined using as a primer an 
oligonucleotide composed of 5 to 50 continuous nucleotides in the nucleotide sequence represented by SEQ ID NO:1,
for example, according to PCR using the chromosome DNA as a template. A particularly preferred primer in determination
of the full nucleotide sequence is an oligonucleotide having nucleotide sequences which are positioned at
the interval of about 300 to 500 bp, and among such oligonucleotides, an oligonucleotide having a nucleotide sequence 
selected from DNAs encoding a protein relating to a main metabolic pathway is particularly preferred. The polynucle- 
otide in which the full nucleotide sequence of the chromosome DNA can be determined using the oligonucleotide 
includes polynucleotides constituting a chromosome DNA derived from a microorganism belonging to coryneform bac- 
teria. Such a polynucleotide is preferably a polynucleotide constituting chromosome DNA derived from a microorganism 
belonging to the genus Corynebacterium, more preferably a polynucleotide constituting a chromosome DNA of Co- 
rynebacterium glutamicum. 
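
The idea of primers positioned at intervals of about 300 to 500 bp along a reference sequence can be sketched as follows. This is only an illustration under stated assumptions: the primer length of 20 and the interval of 400 are arbitrary values within the ranges given in [0088], the function name is hypothetical, and the preference for sequences from genes of a main metabolic pathway is not modelled.

```python
def tiled_primers(reference, primer_len=20, interval=400):
    """Return (start, primer) pairs taken from `reference` every `interval` bases.

    primer_len must be within the 5 to 50 nucleotide range stated in the text.
    """
    if not 5 <= primer_len <= 50:
        raise ValueError("primer length must be between 5 and 50 nucleotides")
    if interval <= 0:
        raise ValueError("interval must be positive")
    last_start = len(reference) - primer_len
    return [
        (start, reference[start:start + primer_len].upper())
        for start in range(0, last_start + 1, interval)
    ]


# Toy reference of 1,200 nucleotides, invented for illustration only.
reference = "ACGT" * 300
for start, primer in tiled_primers(reference):
    print(start, primer)
```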

2. Identification of ORF (open reading frame) and expression regulatory fragment and determination of the function of 
ORF 

[0089] Based on the full nucleotide sequence data of the genome derived from coryneform bacteria determined in 
the above item 1, an ORF and an expression modulating fragment can be identified. Furthermore, the function of the 
thus determined ORF can be determined. 

[0090] The ORF means a continuous region in the nucleotide sequence of mRNA which can be translated as an 
amino acid sequence to mature to a protein. A region of the DNA coding for the ORF of mRNA is also called ORF. 
[0091] The expression modulating fragment (hereinafter referred to as "EMF") is used herein to define a series of
polynucleotide fragments which modulate the expression of the ORF or another sequence operably ligated thereto.
The expression "modulate the expression of a sequence operably ligated" is used herein to refer to changes in the
expression of a sequence due to the presence of the EMF. Examples of the EMF include a promoter, an operator, an
enhancer, a silencer, a ribosome-binding sequence, a transcriptional termination sequence, and the like. In coryneform
bacteria, an EMF is usually present in an intergenic segment (a fragment positioned between two genes; about 10 to
200 nucleotides in length). Accordingly, an EMF is frequently present in an intergenic segment of 10 nucleotides or
longer. It is also possible to determine or discover the presence of an EMF by using known EMF sequences as a target
sequence or a target structural motif (or a target motif) with appropriate software or a comparator, such as FASTA
(Proc. Natl. Acad. Sci. USA, 85: 2444-48 (1988)), BLAST (J. Mol. Biol., 215: 403-410 (1990)) or the like. Also, an EMF can
be identified and evaluated using a known EMF-capturing vector (for example, pKK232-8; manufactured by Amersham
Pharmacia Biotech).
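
Because an EMF is frequently present in an intergenic segment of 10 nucleotides or longer, candidate regions can be listed from ORF coordinates alone. The Python sketch below assumes ORFs are given as 0-based, end-exclusive (start, end) pairs on one coordinate axis; the function name and the toy coordinates are hypothetical, and strand information is ignored.

```python
def intergenic_segments(orfs, min_len=10):
    """Return (start, end) gaps of at least `min_len` nucleotides between ORFs.

    orfs: iterable of (start, end) pairs, 0-based and end-exclusive.
    Overlapping ORFs are handled by tracking the furthest end seen so far.
    """
    ordered = sorted(orfs)
    if not ordered:
        return []
    segments = []
    prev_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start - prev_end >= min_len:
            segments.append((prev_end, start))
        prev_end = max(prev_end, end)
    return segments


# Toy ORF coordinates, invented for illustration only.
orfs = [(0, 300), (350, 900), (905, 1500), (1400, 2000), (2150, 2600)]
print(intergenic_segments(orfs))
```

The segments returned are only candidates; whether a given segment actually carries a promoter or other EMF would still have to be checked by motif searching or with an EMF-capturing vector as described above.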

[0092] The term "target sequence" is used herein to refer to a nucleotide sequence composed of 6 or more
nucleotides, an amino acid sequence composed of 2 or more amino acids, or a nucleotide sequence encoding this amino
acid sequence composed of 2 or more amino acids. The longer a target sequence is, the lower the probability that it
appears at random in a data base. The target sequence is preferably about 10 to 100 amino acid residues or about
30 to 300 nucleotide residues.
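
The effect of target length on chance matches can be made concrete with a rough estimate. The sketch below assumes a simplified model in which every base is independent and equally likely, so a target of length k matches a given position with probability 4^-k; the database size of 3,300,000 nucleotides is an arbitrary value chosen only for illustration, and the function name is hypothetical.

```python
def expected_random_hits(target_len, database_len, alphabet_size=4):
    """Expected number of chance matches of one target in a random database.

    Assumes independent, uniformly distributed residues (a simplification).
    Use alphabet_size=20 for amino acid sequences.
    """
    positions = max(database_len - target_len + 1, 0)
    return positions * alphabet_size ** (-target_len)


DATABASE_LEN = 3_300_000  # arbitrary illustrative size
for k in (6, 10, 30):
    print(k, expected_random_hits(k, DATABASE_LEN))
```

Under this model a 6-nucleotide target is expected to occur hundreds of times by chance, whereas a 30-nucleotide target is expected to occur essentially never, which is consistent with the preferred length range stated above.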

[0093] The term "target structural motif" or "target motif" is used herein to refer to a sequence or a combination of
sequences selected optionally and reasonably. Such a motif is selected on the basis of the three-dimensional structure
formed by the folding of a polypeptide by means known to one of ordinary skill in the art. Various motifs are known.

[0094] Examples of the target motif of a polypeptide include, but are not limited to, an enzyme activity site, a
protein-protein interaction site, a signal sequence, and the like. Examples of the target motif of a nucleic acid include a promoter
sequence, a transcriptional regulatory factor binding sequence, a hairpin structure, and the like.
[0095] Examples of highly useful EMF include a high-expression promoter, an inducible-expression promoter, and
the like. Such an EMF can be obtained by positionally determining the nucleotide sequence of a gene which is known
or expected as achieving high expression (for example, ribosomal RNA gene; GenBank Accession No. M16175 or
Z46753) or a gene showing a desired induction pattern (for example, isocitrate lyase gene induced by acetic acid:
Japanese Published Unexamined Patent Application No. 56782/93) via the alignment with the full genome nucleotide
sequence determined in the above item 1, and isolating the genome fragment in the upstream part (usually 200 to 500
nucleotides from the translation initiation site). It is also possible to obtain a highly useful EMF by selecting an EMF
showing a high expression efficiency or a desired induction pattern from among promoters captured by the
EMF-capturing vector as described above.

[0096] The ORF can be identified by extracting characteristics common to individual ORFs, constructing a general
model based on these characteristics, and measuring the conformity of the subject sequence with the model. In the
identification, software such as GeneMark (Nuc. Acids. Res., 22: 4756-67 (1994); manufactured by GenePro),
GeneMark.hmm (manufactured by GenePro), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)),
Glimmer (Nuc. Acids. Res., 26: 544-548 (1998); manufactured by The Institute for Genomic Research), or the like can
be used. In using the software, the default (initial setting) parameters are usually used, though the parameters can be
optionally changed.
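
The programs named above score candidate regions against statistical models of coding sequence. The Python sketch below does not reproduce any of them; it is a much simpler substitute that only scans the three forward reading frames for an ATG followed in frame by a stop codon, to show what an ORF is in coordinate terms. The minimum length, the restriction to ATG start codons and to the forward strand, and all names are assumptions made for brevity.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end) pairs of simple ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    min_codons counts the codons from the start codon up to, but not
    including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence, invented for illustration only.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(find_orfs(toy, min_codons=3))
```

A scan of this kind over-predicts heavily on a real genome, which is why model-based software such as that listed in the paragraph above is used in practice.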

[0097] In the above-described comparisons, a computer such as a UNIX machine, a PC, a Macintosh, or the like can be used.
[0098] Examples of the ORF determined by the method of the present invention include ORFs having the nucleotide 
sequences represented by SEQ ID NOS:2 to 3501 present in the genome of Corynebacterium glutamicum as repre- 
sented by SEQ ID NO:1. In these ORFs, polypeptides having the amino acid sequences represented by SEQ ID NOS: 
3502 to 7001 are encoded. 

[0099] The function of an ORF can be determined by comparing the identified amino acid sequence of the ORF with
known homologous sequences using homology searching software or a comparator, such as BLAST, FASTA, Smith &
Waterman (Meth. Enzym., 164: 765 (1988)) or the like on an amino acid data base, such as Swiss-Prot, PIR,
GenBank-nr-aa, GenPept constituted by protein-encoding domains derived from GenBank data base, OWL or the like.
[0100] Furthermore, by the homology searching, the identity and similarity with the amino acid sequences of known 
proteins can also be analyzed. 

[0101] With respect to the term "identity" used herein, where two polypeptides each having 10 amino acids
differ at 3 amino acid positions, these polypeptides have an identity of 70% with each other. In a case where
one of the 3 differing amino acids is an analogue (for example, leucine and isoleucine), these polypeptides have a similarity
of 80%.
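
The worked example in [0101] can be reproduced with a short Python sketch. It assumes two ungapped sequences of equal length; the grouping of analogous amino acids is an illustrative assumption (the text only gives leucine and isoleucine as an example), and the two 10-residue peptides are invented.

```python
# Illustrative groups of analogous residues; an assumption, not taken from the text.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("NQ"), set("ST")]


def is_analogue(x, y):
    """True if two different residues fall in the same illustrative group."""
    return x != y and any(x in group and y in group for group in SIMILAR_GROUPS)


def identity_and_similarity(a, b):
    """Return (identity %, similarity %) for two equal-length, ungapped peptides."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(x == y for x, y in zip(a, b))
    analogous = sum(is_analogue(x, y) for x, y in zip(a, b))
    n = len(a)
    return 100 * identical / n, 100 * (identical + analogous) / n


# Two invented 10-residue peptides differing at 3 positions (L/I, H/P, E/W).
print(identity_and_similarity("MKTLLDAGHE", "MKTILDAGPW"))
```

With 7 of 10 positions identical and one further position occupied by leucine and isoleucine, the sketch gives an identity of 70% and a similarity of 80%, matching the example in the text.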

[0102] As a specific example, Table 1 shows the registration numbers in known data bases of sequences which are 
judged as having the highest similarity with the nucleotide sequence of the ORF derived from Corynebacterium glutami- 
cum ATCC 13032, genes of these sequences, functions of these genes, and identities thereof compared with known 
amino acid translation sequences. 

[0103] Thus, a great number of novel genes derived from coryneform bacteria can be identified by determining the 
full nucleotide sequence of the genome derived from coryneform bacterium by the means of the present invention. 
Moreover, the function of the proteins encoded by these genes can be determined. Since coryneform bacteria are 
industrially highly useful microorganisms, many of the identified genes are industrially useful. 
[0104] Moreover, the characteristics of respective microorganisms can be clarified by classifying the functions thus
determined. As a result, valuable information in breeding is obtained. 

[0105] Furthermore, from the ORF information derived from coryneform bacteria, the ORF corresponding to the 
microorganism is prepared and obtained according to the general method as disclosed in Molecular Cloning, 2nd ed. 
or the like. Specifically, an oligonucleotide having a nucleotide sequence adjacent to the ORF is synthesized, and the 
ORF can be isolated and obtained using the oligonucleotide as a primer and a chromosome DNA derived from
coryneform bacteria as a template according to the general PCR cloning technique. The ORF sequences thus obtained
include polynucleotides comprising the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3501.
[0106] The ORF or primer can be prepared using a polypeptide synthesizer based on the above sequence informa- 
tion. 

[0107] Examples of the polynucleotide of the present invention include a polynucleotide containing the nucleotide 
sequence of the ORF obtained in the above, and a polynucleotide which hybridizes with the polynucleotide under 
stringent conditions. 

[0108] The polynucleotide of the present invention can be a single-stranded DNA, a double-stranded DNA and a 
single-stranded RNA, though it is not limited thereto. 

[0109] The polynucleotide which hybridizes with the polynucleotide containing the nucleotide sequence of the ORF 
obtained in the above under stringent conditions includes a degenerated mutant of the ORF. A degenerated mutant is 
a polynucleotide fragment having a nucleotide sequence which is different from the sequence of the ORF of the present 
invention but which encodes the same amino acid sequence owing to the degeneracy of the genetic code. 
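
The notion of a degenerated mutant can be illustrated with a minimal sketch. The codon table is deliberately partial (only the codons needed here), and both sequences are made-up examples rather than sequences of the present disclosure.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M", "GCT": "A", "GCC": "A", "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L", "TAA": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon (stop codons shown as '*')."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(orf: str, candidate: str) -> bool:
    """True if the nucleotide sequences differ but encode the same amino acids."""
    return orf != candidate and translate(orf) == translate(candidate)


original = "ATGGCTAAACTGTAA"  # hypothetical ORF
variant = "ATGGCCAAGTTATGA"   # hypothetical rewrite using synonymous codons

print(translate(original), translate(variant))
print(is_degenerate_variant(original, variant))
```

Both toy sequences differ at four codons yet translate to the same peptide, which is the situation the paragraph above describes.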

[0110] Specific examples include a polynucleotide comprising the nucleotide sequence represented by any one of 
SEQ ID NOS:2 to 3431, and a polynucleotide which hybridizes with the polynucleotide under stringent conditions. 

[0111] A polynucleotide which hybridizes under stringent conditions is a polynucleotide obtained by colony 
hybridization, plaque hybridization, Southern blot hybridization or the like using, as a probe, the polynucleotide having the 
nucleotide sequence of the ORF identified in the above. Specific examples include a polynucleotide which can be 
identified by carrying out hybridization at 65°C in the presence of 0.7-1.0 M NaCl using a filter on which a polynucleotide 
prepared from colonies or plaques is immobilized, and then washing the filter with 0.1x to 2x SSC solution (the 
composition of 1x SSC contains 150 mM sodium chloride and 15 mM sodium citrate) at 65°C. 
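
As a worked illustration of the wash conditions, the salt concentrations of a diluted or concentrated SSC solution follow by simple scaling of the 1x composition given above. The helper name below is an assumption introduced only for this sketch.

```python
# Composition of 1x SSC as stated in the text, in mM.
SSC_1X_MM = {"sodium chloride": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold: float) -> dict:
    """Scale the 1x SSC composition to an n-fold solution (values in mM)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}


# The wash range named in the text runs from 0.1x to 2x SSC.
for fold in (0.1, 1.0, 2.0):
    comp = ssc_composition(fold)
    print(f"{fold}x SSC: "
          f"{comp['sodium chloride']:.1f} mM NaCl, "
          f"{comp['sodium citrate']:.1f} mM sodium citrate")
```

Under this scaling the wash range corresponds to roughly 15 to 300 mM sodium chloride.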

[0112] The hybridization can be carried out in accordance with known methods described in, for example, Molecular 
Cloning, 2nd ed., Current Protocols in Molecular Biology, DNA Cloning 1: Core Techniques, A Practical Approach, 
Second Edition, Oxford University (1995) or the like. Specific examples of the polynucleotide which can be hybridized 
include a DNA having a homology of 60% or more, preferably 80% or more, and particularly preferably 95% or more, 
with the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3431 when calculated using default (initial 
setting) parameters of a homology searching software, such as BLAST, FASTA, Smith-Waterman or the like. 
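
The thresholds above can be made concrete with a simplified sketch. It is not BLAST, FASTA or Smith-Waterman: it assumes the two sequences have already been aligned (gaps written as "-") and merely counts identical columns, which is only one of several ways such programs report similarity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical columns in a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def preference_tier(identity: float) -> str:
    """Map an identity value onto the 60% / 80% / 95% tiers named in the text."""
    if identity >= 95.0:
        return "particularly preferred (95% or more)"
    if identity >= 80.0:
        return "preferred (80% or more)"
    if identity >= 60.0:
        return "included (60% or more)"
    return "below the 60% threshold"


identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")  # toy alignment
print(identity, preference_tier(identity))
```
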
[0113] Also, the polynucleotide of the present invention includes a polynucleotide encoding a polypeptide comprising 
the amino acid sequence represented by any one of SEQ ID NOS:3502 to 6931 and a polynucleotide which hybridizes 
with the polynucleotide under stringent conditions. 

[0114] Furthermore, the polynucleotide of the present invention includes a polynucleotide which is present in the 5' 
upstream or 3' downstream region of a polynucleotide comprising the nucleotide sequence of any one of SEQ ID NOS: 
2 to 3431 in a polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1, and has an activity 
of regulating an expression of a polypeptide encoded by the polynucleotide. Specific examples of the polynucleotide 
having an activity of regulating an expression of a polypeptide encoded by the polynucleotide include a polynucleotide 
encoding the above described EMF, such as a promoter, an operator, an enhancer, a silencer, a ribosome-binding 
sequence, a transcriptional termination sequence, and the like. 

[0115] The primer used for obtaining the ORF according to the above PCR cloning technique includes an oligonu- 
cleotide comprising a sequence which is the same as a sequence of 10 to 200 continuous nucleotides in the nucleotide 
sequence of the ORF and an adjacent region or an oligonucleotide comprising a sequence which is complementary 
to the oligonucleotide. Specific examples include an oligonucleotide comprising a sequence which is the same as a 
sequence of 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1 
to 3431, and an oligonucleotide comprising a sequence complementary to the oligonucleotide comprising a sequence 
of at least 10 to 20 continuous nucleotides of any one of SEQ ID NOS:1 to 3431. When the primers are used as a sense 
primer and an antisense primer, the above-described oligonucleotides in which melting temperature (Tm) and the 
number of nucleotides are not significantly different from each other are preferred. 
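
The preference for a matched primer pair can be sketched as follows. The Wallace rule (2°C per A/T, 4°C per G/C) is a rough estimate intended for short oligonucleotides and is used here only as an assumption; the tolerance values and function names are likewise hypothetical and are not taken from the present disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> float:
    """Rough melting temperature in °C by the Wallace rule (short oligos only)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2.0 * at + 4.0 * gc


def primers_are_balanced(sense: str, antisense: str,
                         max_tm_diff: float = 4.0,
                         max_len_diff: int = 3) -> bool:
    """Check that Tm and length of the two primers do not differ significantly.

    The tolerances are illustrative defaults, not values from the text.
    """
    return (abs(wallace_tm(sense) - wallace_tm(antisense)) <= max_tm_diff
            and abs(len(sense) - len(antisense)) <= max_len_diff)


def primer_pair(region: str, length: int = 20):
    """Take the sense primer from the 5' end and the antisense primer
    as the reverse complement of the 3' end of the region to be amplified."""
    return region[:length], reverse_complement(region[-length:])
```

For example, `primer_pair(region, 20)` applied to an ORF plus its adjacent regions yields two 20-mers that can then be screened with `primers_are_balanced`.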

[0116] The oligonucleotide of the present invention includes an oligonucleotide comprising a sequence which is the 
same as 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1 to 
3431 or an oligonucleotide comprising a sequence complementary to the oligonucleotide. 

[0117] Also, analogues of these oligonucleotides (hereinafter also referred to as "analogous oligonucleotides") are 
provided by the present invention and are useful in the methods described herein. 

[0118] Examples of the analogous oligonucleotides include analogous oligonucleotides in which a phosphodiester 
bond in an oligonucleotide is converted to a phosphorothioate bond, analogous oligonucleotides in which a 
phosphodiester bond in an oligonucleotide is converted to an N3'-P5' phosphoramidate bond, analogous oligonucleotides in which 
ribose and a phosphodiester bond in an oligonucleotide are converted to a peptide nucleic acid bond, analogous 
oligonucleotides in which uracil in an oligonucleotide is replaced with C-5 propynyluracil, analogous oligonucleotides in 
which uracil in an oligonucleotide is replaced with C-5 thiazoluracil, analogous oligonucleotides in which cytosine in 
an oligonucleotide is replaced with C-5 propynylcytosine, analogous oligonucleotides in which cytosine in an 
oligonucleotide is replaced with phenoxazine-modified cytosine, analogous oligonucleotides in which ribose in an 
oligonucleotide is replaced with 2'-O-propylribose, analogous oligonucleotides in which ribose in an oligonucleotide is replaced 
with 2'-methoxyethoxyribose, and the like (Cell Engineering, 16: 1463 (1997)). 

[0119] The above oligonucleotides and analogous oligonucleotides of the present invention can be used not only as 
primers but also as probes for hybridization and as antisense nucleic acids described below. 

[0120] Examples of a primer for the antisense nucleic acid techniques known in the art include an oligonucleotide 
which hybridizes with the oligonucleotide of the present invention under stringent conditions and has an activity of regulating 
expression of the polypeptide encoded by the polynucleotide, in addition to the above oligonucleotide. 

3. Determination of isozymes 

[0121] Many mutants of coryneform bacteria which are useful in the production of useful substances, such as amino 
acids, nucleic acids, vitamins, saccharides, organic acids, and the like, are obtained by the present invention. 
[0122] However, since the gene sequence data of the microorganism has been, to date, insufficient, useful mutants 
have been obtained by mutagenic techniques using a mutagen, such as nitrosoguanidine (NTG) or the like. 
[0123] Although genes can be mutated randomly by the mutagenic method using the above-described mutagen, all 
genes encoding respective isozymes having similar properties relating to the metabolism of intermediates cannot be 
mutated. In the mutagenic method using a mutagen, genes are mutated randomly. Accordingly, harmful mutations 
worsening culture characteristics, such as delay in growth, accelerated foaming, and the like, might be imparted at a 
great frequency, in a random manner. 

[0124] However, if gene sequence information is available, such as is provided by the present invention, it is possible 
to mutate all of the genes encoding target isozymes. In this case, harmful mutations may be avoided and the target 
mutation can be incorporated. 

[0125] Namely, an accurate number and sequence information of the target isozymes in coryneform bacteria can be 
obtained based on the ORF data obtained in the above item 2. By using the sequence information, all of the target 
isozyme genes can be mutated into genes having the desired properties by, for example, the site-specific mutagenesis 
method described in Molecular Cloning, 2nd ed. to obtain useful mutants having elevated productivity of useful sub- 
stances. 

4. Clarification or determination of biosynthesis pathway and signal transmission pathway 

[0126] Attempts have been made to elucidate biosynthesis pathways and signal transmission pathways in a number 
of organisms, and many findings have been reported. However, there are many unknown aspects of coryneform bac- 
teria since a number of genes have not been identified so far. 
[0127] These unknown points can be clarified by the following method. 

[0128] The functional information of ORF derived from coryneform bacteria as identified by the method of above item 
2 is arranged. The term "arranged" means that the ORF is classified based on the biosynthesis pathway of a substance 
or the signal transmission pathway to which the ORF belongs using known information according to the functional 
information. Next, the arranged ORF sequence information is compared with enzymes on the biosynthesis pathways 
or signal transmission pathways of other known organisms. The resulting information is combined with known data on 
coryneform bacteria. Thus, the biosynthesis pathways and signal transmission pathways in coryneform bacteria, which 
have been unknown so far, can be determined. 
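
The "arranging" step and the subsequent comparison with pathways of other organisms can be sketched as a simple grouping followed by a set difference. All ORF identifiers, pathway labels and the reference enzyme list below are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def arrange_by_pathway(orf_annotations: dict) -> dict:
    """Group ORFs by the pathway named in their functional annotation.

    orf_annotations maps an ORF identifier to (enzyme name, pathway name).
    """
    arranged = defaultdict(list)
    for orf_id, (enzyme, pathway) in orf_annotations.items():
        arranged[pathway].append((orf_id, enzyme))
    return dict(arranged)


def missing_steps(arranged: dict, pathway: str, reference_enzymes: list) -> list:
    """Enzymes known from a reference organism's pathway with no matching ORF."""
    found = {enzyme for _, enzyme in arranged.get(pathway, [])}
    return [enzyme for enzyme in reference_enzymes if enzyme not in found]


# Hypothetical annotations (identifiers are placeholders, not SEQ ID NOS).
annotations = {
    "orf_a": ("aspartokinase", "lysine biosynthesis"),
    "orf_b": ("homoserine dehydrogenase", "threonine biosynthesis"),
    "orf_c": ("dihydrodipicolinate synthase", "lysine biosynthesis"),
}
# Illustrative reference list for the same pathway in another organism.
reference = ["aspartokinase",
             "aspartate-semialdehyde dehydrogenase",
             "dihydrodipicolinate synthase"]

arranged = arrange_by_pathway(annotations)
print(missing_steps(arranged, "lysine biosynthesis", reference))
```

Steps reported as missing are candidates for further ORF searches or for genuinely different pathway variants in the organism under study.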

[0129] Once these pathways, which have been unknown or unclear hitherto, are clarified, a useful mutant 
for producing a target useful substance can be efficiently obtained. 

[0130] When the thus clarified pathway is judged as important in the synthesis of a useful product, a useful mutant 
can be obtained by selecting a mutant wherein this pathway has been strengthened. Also, when the thus clarified 
pathway is judged as not important in the biosynthesis of the target useful product, a useful mutant can be obtained 
by selecting a mutant wherein the utilization frequency of this pathway is lowered. 

5. Clarification or determination of useful mutation point 

[0131] Many useful mutants of coryneform bacteria which are suitable for the production of useful substances, such 
as amino acids, nucleic acids, vitamins, saccharides, organic acids, and the like, have been obtained. However, it is 
hardly known which mutation points imparted to which genes improve the productivity. 

[0132] However, mutation points contained in production strains can be identified by comparing desired sequences 
of the genome DNA of the production strains obtained from coryneform bacteria by the mutagenic technique with the 
nucleotide sequences of the corresponding genome DNA and ORF derived from coryneform bacteria determined by 
the methods of the above items 1 and 2 and analyzing them. 
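
A minimal sketch of such a comparison is given below. It assumes the two sequences are of equal length and already in register (substitutions only); insertions and deletions would require a proper alignment first. The sequences are toy examples.

```python
def find_mutation_points(reference: str, production: str) -> list:
    """List (1-based position, reference base, production-strain base)
    for every position at which the two sequences differ."""
    if len(reference) != len(production):
        raise ValueError("sequences must be aligned to equal length")
    return [(i + 1, ref_base, prod_base)
            for i, (ref_base, prod_base) in enumerate(zip(reference, production))
            if ref_base != prod_base]


wild_type = "ATGGTTCTG"        # hypothetical reference fragment
production_strain = "ATGGCTCTG"  # hypothetical production-strain fragment

print(find_mutation_points(wild_type, production_strain))
```

In this toy case the single difference changes the second codon from GTT (valine) to GCT (alanine), the same kind of substitution as the replacement mutations discussed below.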

[0133] Moreover, effective mutation points contributing to the production can be easily specified from among these 
mutation points on the basis of known information relating to the metabolic pathways, the metabolic regulatory mech- 
anisms, the structure activity correlation of enzymes, and the like. 
[0134] When an effective mutation can hardly be specified based on known data, the mutation points thus identified 
can be introduced into a wild strain of coryneform bacteria or a production strain free of the mutation. Then, it is examined 
whether or not any positive effect can be achieved on the production. 

[0135] For example, by comparing the nucleotide sequence of homoserine dehydrogenase gene hom of a 
lysine-producing B-6 strain of Corynebacterium glutamicum (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)) with the 
nucleotide sequence corresponding to the genome of Corynebacterium glutamicum ATCC 13032 according to the present 
invention, a mutation of amino acid replacement in which valine at the 59-position is replaced with alanine (Val59Ala) 
was identified. A strain obtained by introducing this mutation into the ATCC 13032 strain by the gene replacement 
method can produce lysine, which indicates that this mutation is an effective mutation contributing to the production 
of lysine. 

[0136] Similarly, by comparing the nucleotide sequence of pyruvate carboxylase gene pyc of the B-6 strain with the 
nucleotide sequence corresponding to the ATCC 13032 genome, a mutation of amino acid replacement in which proline 
at the 458-position was replaced with serine (Pro458Ser) was identified. A strain obtained by introducing this mutation 
into a lysine-producing strain No. 58 (FERM BP-7134) of Corynebacterium glutamicum free of this mutation shows 
an improved lysine productivity in comparison with the No. 58 strain, which indicates that this mutation is an effective 
mutation contributing to the production of lysine. 

[0137] In addition, a mutation Ala213Thr in glucose-6-phosphate dehydrogenase was specified as an effective 
mutation relating to the production of lysine by detecting glucose-6-phosphate dehydrogenase gene zwf of the B-6 strain. 

[0138] Furthermore, the lysine-productivity of Corynebacterium glutamicum was improved by replacing the base at 
the 932-position of aspartokinase gene lysC of the Corynebacterium glutamicum ATCC 13032 genome with cytosine 
to thereby replace threonine at the 311-position by isoleucine, which indicates that this mutation is an effective mutation 
contributing to the production of lysine. 
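
The relation between a base position and an amino acid position, as in base 932 and residue 311, is plain arithmetic. The sketch below assumes that numbering starts at 1 with the first base of the start codon; the helper names are introduced only for illustration.

```python
from typing import Tuple


def codon_of_base(base_position: int) -> Tuple[int, int]:
    """Return (codon number, position within the codon), both 1-based,
    for a 1-based base position counted from the first base of the ORF."""
    if base_position < 1:
        raise ValueError("base position must be 1 or greater")
    index = base_position - 1
    return index // 3 + 1, index % 3 + 1


def substitution_label(wild_type_residue: str, position: int, mutant_residue: str) -> str:
    """Format an amino acid replacement in the style used in the text."""
    return f"{wild_type_residue}{position}{mutant_residue}"


print(codon_of_base(932))
print(substitution_label("Thr", 311, "Ile"))
```

Under this numbering, base 932 is the second base of codon 311, consistent with a change at residue 311.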

[0139] Also, as another method to examine whether or not the identified mutation point is an effective mutation, there 
is a method in which the mutation possessed by the lysine-producing strain is returned to the sequence of a wild type 
strain by the gene replacement method and it is examined whether or not this has a negative influence on the lysine 
productivity. For example, when the amino acid replacement mutation Val59Ala possessed by hom of the lysine-producing B-6 strain 
was returned to a wild type amino acid sequence, the lysine productivity was lowered in comparison with the B-6 strain. 
Thus, it was found that this mutation is an effective mutation contributing to the production of lysine. 

[0140] Effective mutation points can be more efficiently and comprehensively extracted by combining, if needed, the 
DNA array analysis or proteome analysis described below. 

6. Method of breeding industrially advantageous production strain 



[0141] It has been a general practice to construct production strains, which are used industrially in the fermentation 
production of the target useful substances, such as amino acids, nucleic acids, vitamins, saccharides, organic acids, 
and the like, by repeating mutagenesis and breeding based on random mutagenesis using mutagens, such as NTG 
or the like, and screening. 

[0142] In recent years, many examples of improved production strains have been made through the use of 
recombinant DNA techniques. In breeding, however, most of the parent production strains to be improved are mutants 
obtained by a conventional mutagenic procedure (W. Leuchtenberger, Amino Acids - Technical Production and Use. In: 
Roehr (ed) Biotechnology, second edition, vol. 6, products of primary metabolism. VCH Verlagsgesellschaft mbH 
Weinheim, p. 465 (1996)). 

[0143] Although mutagenesis methods have largely contributed to the progress of the fermentation industry, they 
suffer from a serious problem of multiple, random introduction of mutations into every part of the chromosome. Since 
many mutations are accumulated in a single chromosome each time a strain is improved, a production strain obtained 
by the random mutation and selection is generally inferior in properties (for example, showing poor growth, delayed 
consumption of saccharides, and poor resistance to stresses such as temperature and oxygen) to a wild type strain, 
which brings about troubles such as failing to establish a sufficiently elevated productivity, being frequently 
contaminated with miscellaneous bacteria, requiring troublesome procedures in culture maintenance, and the like, and, in its 
turn, elevating the production cost in practice. In addition, the improvement in the productivity is based on random 
mutations and thus the mechanism thereof is unclear. Therefore, it is very difficult to plan a rational breeding strategy 
for the subsequent improvement in the productivity. 

[0144] According to the present invention, effective mutation points contributing to the production can be efficiently 
specified from among many mutation points accumulated in the chromosome of a production strain which has been 
bred from coryneform bacteria and, therefore, a novel breeding method of assembling these effective mutations in the 
coryneform bacteria can be established. Thus, a useful production strain can be reconstructed. It is also possible to 
construct a useful production strain from a wild type strain. 
[0145] Specifically, a useful mutant can be constructed in the following manner. 

[0146] One of the mutation points is incorporated into a wild type strain of coryneform bacteria. Then, it is examined 
whether or not a positive effect is established on the production. When a positive effect is obtained, the mutation point 
is saved. When no effect is obtained, the mutation point is removed. Subsequently, only a strain having the effective 
mutation point is used as the parent strain, and the same procedure is repeated. In general, the effectiveness of a 
mutation positioned upstream cannot be clearly evaluated in some cases when there is a rate-determining point in the 
downstream of a biosynthesis pathway. It is therefore preferred to successively evaluate mutation points upward from 
downstream. 
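
The procedure of this paragraph can be sketched as a loop. The productivity function below stands in for an actual fermentation test, and the mutation names and their effects are invented for illustration; only the keep-or-discard logic and the downstream-first ordering follow the text.

```python
from typing import Callable, FrozenSet, List, Sequence


def assemble_effective_mutations(
    candidates_downstream_first: Sequence[str],
    productivity: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Introduce candidate mutation points one at a time, starting from the
    wild type, keeping only those that improve the measured productivity."""
    parent: FrozenSet[str] = frozenset()  # wild type strain: no mutation yet
    kept: List[str] = []
    for mutation in candidates_downstream_first:
        trial = parent | {mutation}
        if productivity(trial) > productivity(parent):
            parent = trial            # the mutation point is saved
            kept.append(mutation)
        # otherwise the mutation point is removed and the parent is unchanged
    return kept


# Hypothetical additive effects standing in for fermentation results.
EFFECTS = {"mutation_A": 2.0, "mutation_B": 0.0, "mutation_C": -1.0, "mutation_D": 1.5}


def toy_productivity(strain: FrozenSet[str]) -> float:
    return 10.0 + sum(EFFECTS[m] for m in strain)


order = ["mutation_A", "mutation_B", "mutation_C", "mutation_D"]  # downstream first
print(assemble_effective_mutations(order, toy_productivity))
```

Ordering the candidates from the downstream end of the pathway matters in practice because, as noted above, an upstream mutation may show no measurable effect while a downstream rate-determining point remains.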

[0147] By reconstituting effective mutations by the method as described above in a wild type strain or a strain which 
has a high growth speed or the same ability to consume saccharides as the wild type strain, it is possible to construct 
an industrially advantageous strain which is free of troubles in the previous methods as described above and to conduct 
fermentation production using such strains within a short time or at a higher temperature. 

[0148] For example, a lysine-producing mutant B-6 (Appl. Microbiol. Biotechnol., 32: 262-273 (1989)), which is 
obtained by multiple rounds of random mutagenesis from a wild type strain, 
enables lysine fermentation to be performed at a temperature between 30 and 34°C but shows lowered growth and 
lysine productivity at a temperature exceeding 34°C. Therefore, the fermentation temperature should be maintained 
at 34°C or lower. In contrast thereto, the production strain described in the above item 5, which is obtained by 
reconstituting effective mutations relating to lysine production, can achieve a productivity at 40 to 42°C equal or superior to 
the result obtained by culturing at 30 to 34°C. Therefore, this strain is industrially advantageous since it can save the 
load of cooling during the fermentation. 

[0149] When culture should be carried out at a high temperature exceeding 43°C, a production strain capable of 
conducting fermentation production at a high temperature exceeding 43°C can be obtained by reconstituting useful 
mutations in a microorganism belonging to the genus Corynebacterium which can grow at high temperature exceeding 
43°C. Examples of the microorganism capable of growing at a high temperature exceeding 43°C include 
Corynebacterium thermoaminogenes, such as Corynebacterium thermoaminogenes FERM 9244, FERM 9245, FERM 9246 and 
FERM 9247. 

[0150] A strain having a further improved productivity of the target product can be obtained using the thus recon- 
structed strain as the parent strain and further breeding it using the conventional mutagenesis method, the gene am- 
plification method, the gene replacement method using the recombinant DNA technique, the transduction method or 
the cell fusion method. Accordingly, the microorganism of the present invention includes, but is not limited to, a mutant, 
a cell fusion strain, a transformant, a transductant or a recombinant strain constructed by using recombinant DNA 
techniques, so long as it is a producing strain obtained via the step of accumulating at least two effective mutations in 
a coryneform bacterium in the course of breeding. 

[0151] When a mutation point judged as being harmful to the growth or production is specified, on the other hand, 
it is examined whether or not the producing strain used at present contains the mutation point. When it has the mutation, 
it can be returned to the wild type gene and thus a further useful production strain can be bred. 
[0152] The breeding method as described above is applicable to microorganisms, other than coryneform bacteria, 
which have industrially advantageous properties (for example, microorganisms capable of quickly utilizing less expen- 
sive carbon sources, microorganisms capable of growing at higher temperatures). 

7. Production and utilization of polynucleotide array 

(1) Production of polynucleotide array 

[0153] A polynucleotide array can be produced using the polynucleotide or oligonucleotide of the present invention 
obtained in the above items 1 and 2. 

[0154] Examples include a polynucleotide array comprising a solid support to which at least one of a polynucleotide 
comprising the nucleotide sequence represented by SEQ ID NOS:2 to 3501, a polynucleotide which hybridizes with 
the polynucleotide under stringent conditions, and a polynucleotide comprising 10 to 200 continuous nucleotides in 
the nucleotide sequence of the polynucleotide is adhered; and a polynucleotide array comprising a solid support to 
which at least one of a polynucleotide encoding a polypeptide comprising the amino acid sequence represented by 
any one of SEQ ID NOS:3502 to 7001, a polynucleotide which hybridizes with the polynucleotide under stringent 
conditions, and a polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequences of the polynu- 
cleotides is adhered. 

[0155] Polynucleotide arrays of the present invention include substrates known in the art, such as a DNA chip, a 
DNA microarray and a DNA macroarray, and the like, and comprise a solid support and plural polynucleotides or 
fragments thereof which are adhered to the surface of the solid support. 

[0156] Examples of the solid support include a glass plate, a nylon membrane, and the like. 

[0157] The polynucleotides or fragments thereof can be adhered to the surface of the solid support using the 
general technique for preparing arrays. Namely, they can be adhered to a chemically surface-treated solid support, 
for example, a support to which a polycation such as polylysine or the like has been adhered 
(Nat. Genet., 21: 15-19 (1999)). The chemically surface-treated supports are commercially available 
and the commercially available product can be used as the solid support of the polynucleotide array according 
to the present invention. 

[0158] As the polynucleotides or oligonucleotides adhered to the solid support, the polynucleotides and oligonucle- 
otides of the present invention obtained in the above items 1 and 2 can be used. 

[0159] The analysis described below can be efficiently performed by adhering the polynucleotides or oligonucleotides 
to the solid support at a high density, though a high fixation density is not always necessary. 

[0160] Apparatus for achieving a high fixation density, such as an arrayer robot or the like, is commercially available 
from Takara Shuzo (GMS417 Arrayer), and the commercially available product can be used. 

[0161] Also, the oligonucleotides of the present invention can be synthesized directly on the solid support by the 
photolithography method or the like (Nat. Genet., 21: 20-24 (1999)). In this method, a linker having a protective group 
which can be removed by light irradiation is first adhered to a solid support, such as a slide glass or the like. Then, it 
is irradiated with light through a mask (a photolithograph mask) permeating light exclusively at a definite part of the 
adhesion part. Next, an oligonucleotide having a protective group which can be removed by light irradiation is added 
to the part. Thus, a ligation reaction with the nucleotide arises exclusively at the irradiated part. By repeating this 
procedure, oligonucleotides, each having a desired sequence, different from each other can be synthesized in respec- 
tive parts. Usually, the oligonucleotides to be synthesized have a length of 10 to 30 nucleotides. 
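
The masked, stepwise character of this synthesis can be illustrated with a toy simulation. It models only the bookkeeping (which spots are exposed in which cycle and which base is coupled there), not the chemistry; the one-base-per-coupling scheme, the spot names and the target sequences are assumptions made for the sketch.

```python
def synthesize_array(targets: dict, bases: str = "ACGT"):
    """Simulate light-directed synthesis of a different sequence at each spot.

    targets maps a spot name to its desired sequence. Returns the grown
    sequences and the list of (base, exposed spots) masks that were used.
    """
    for seq in targets.values():
        if any(ch not in bases for ch in seq):
            raise ValueError("targets may contain only the given bases")
    grown = {spot: "" for spot in targets}
    masks = []
    while any(grown[s] != targets[s] for s in targets):
        for base in bases:
            # The mask exposes only the spots whose next required base is `base`.
            mask = {s for s in targets
                    if len(grown[s]) < len(targets[s])
                    and targets[s][len(grown[s])] == base}
            if mask:
                for s in mask:
                    grown[s] += base  # coupling occurs only at irradiated spots
                masks.append((base, frozenset(mask)))
    return grown, masks


targets = {"spot1": "ACG", "spot2": "AGT", "spot3": "CA"}  # toy sequences
grown, masks = synthesize_array(targets)
print(grown)
print(len(masks), "masking cycles")
```

Each entry in `masks` corresponds to one irradiation through a mask followed by one coupling step, so the number of entries is the number of cycles this simple schedule would need.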

(2) Use of polynucleotide array 

[0162] The following procedures (a) and (b) can be carried out using the polynucleotide array prepared in the above 
(1). 

(a) Identification of mutation point of coryneform bacterium mutant and analysis of expression amount and expression 
profile of gene encoded by genome 

[0163] By subjecting a gene derived from a mutant of coryneform bacteria or an examined gene to the following 
steps (i) to (iv), the mutation point of the gene can be identified or the expression amount and expression profile of the 
gene can be analyzed: 

(i) producing a polynucleotide array by the method of the above (1); 

(ii) incubating polynucleotides immobilized on the polynucleotide array together with the labeled gene derived from 
a mutant of the coryneform bacterium using the polynucleotide array produced in the above (i) under hybridization 
conditions; 

(iii) detecting the hybridization; and 

(iv) analyzing the hybridization data. 

[0164] The gene derived from a mutant of coryneform bacteria or the examined gene includes a gene relating to 
biosynthesis of at least one selected from amino acids, nucleic acids, vitamins, saccharides, organic acids, and ana- 
logues thereof. 

[0165] The method will be described in detail. 

[0166] A single nucleotide polymorphism (SNP) in a human region of 2,300 kb has been identified using polynucle- 
otide arrays (Science, 280: 1077-82 (1998)). In accordance with the method of identifying SNP and methods described 
in Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999); Science, 284: 1520-23 (1999), and 
the like using the polynucleotide array produced in the above (1) and a nucleic acid molecule (DNA, RNA) derived from 
coryneform bacteria in the method of the hybridization, a mutation point of a useful mutant, which is useful in producing 
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, or the like can be identified and the gene 
expression amount and the expression profile thereof can be analyzed. 

[0167] The nucleic acid molecule (DNA, RNA) derived from the coryneform bacteria can be obtained according to 
the general method described in Molecular Cloning, 2nd ed. or the like. mRNA derived from Corynebacterium glutami- 
cum can also be obtained by the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)) or the like. 
[0168] Although ribosomal RNA (rRNA) is usually obtained in large excess in addition to the target mRNA, the anal- 
ysis is not seriously disturbed thereby. 

[0169] The resulting nucleic acid molecule derived from coryneform bacteria is labeled. Labeling can be carried out 
according to a method using a fluorescent dye, a method using a radioisotope or the like. 

[0170] Specific examples include a labeling method in which psoralen-biotin is crosslinked with RNA extracted from 
a microorganism and, after hybridization reaction, a fluorescent dye having streptavidin bound thereto is bound to 
the biotin moiety (Nat. Biotechnol., 16: 45-48 (1998)); a labeling method in which a reverse transcription reaction is 
carried out using RNA extracted from a microorganism as a template and random primers as primers, and dUTP having 
a fluorescent dye (for example, Cy3, Cy5) (manufactured by Amersham Pharmacia Biotech) is incorporated into cDNA 
(Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999)); and the like. 

[0171] The labeling specificity can be improved by replacing the random primers by sequences complementary to 
the 3'-end of ORF (J. Bacteriol., 181: 6425-40 (1999)). 

[0172] In the hybridization method, the hybridization and subsequent washing can be carried out by the general 
method (Nat. Biotechnol., 14: 1675-80 (1996) or the like). 

[0173] Subsequently, the hybridization intensity is measured depending on the hybridization amount of the nucleic 
acid molecule used in the labeling. Thus, the mutation point can be identified and the expression amount of the gene 
can be calculated. 
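
One common way of turning measured intensities into a relative expression amount is sketched below. Background subtraction followed by a log2 ratio against a reference sample is an assumption made for illustration, not a method prescribed by the text; the floor value is likewise hypothetical.

```python
import math


def net_intensity(signal: float, background: float) -> float:
    """Background-subtracted spot intensity, never below zero."""
    return max(signal - background, 0.0)


def log2_ratio(sample_net: float, reference_net: float, floor: float = 1.0) -> float:
    """log2 of sample over reference intensity.

    Values below `floor` are raised to it so that the ratio stays defined
    for spots with little or no signal.
    """
    return math.log2(max(sample_net, floor) / max(reference_net, floor))


# Toy readings for one spot: (signal, local background).
sample = net_intensity(900.0, 100.0)     # e.g. labeled nucleic acid from a mutant
reference = net_intensity(300.0, 100.0)  # e.g. labeled nucleic acid from a wild type
print(log2_ratio(sample, reference))
```

A positive value indicates a stronger signal in the sample than in the reference for that spot, a negative value a weaker one.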

[0174] The hybridization intensity can be measured by visualizing the fluorescent signal, radioactivity, luminescence 
dose, and the like, using a laser confocal microscope, a CCD camera, a radiation imaging device (for example, STORM 
manufactured by Amersham Pharmacia Biotech), and the like, and then quantifying the thus visualized data. 

[0175] A polynucleotide array on a solid support can also be analyzed and quantified using a commercially available 
apparatus, such as GMS418 Array Scanner (manufactured by Takara Shuzo) or the like. 

[0176] The gene expression amount can be analyzed using commercially available software (for example, ImaGene 
manufactured by Takara Shuzo; Array Gauge manufactured by Fuji Photo Film; ImageQuant manufactured by Amer- 
sham Pharmacia Biotech, or the like). 

[0177] A fluctuation in the expression amount of a specific gene can be monitored using a nucleic acid molecule 
obtained in the time course of culture as the nucleic acid molecule derived from coryneform bacteria. The culture 
conditions can be optimized by analyzing the fluctuation. 
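
Monitoring a fluctuation over the culture time can be sketched as follows. The time points, expression values and the choice of the first sample as baseline are illustrative assumptions.

```python
def fold_changes(time_course: list) -> list:
    """Express each (hours, expression amount) point relative to the first point."""
    baseline = time_course[0][1]
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    return [(hours, value / baseline) for hours, value in time_course]


def peak_time(time_course: list) -> float:
    """Culture time at which the expression amount is highest."""
    return max(time_course, key=lambda point: point[1])[0]


# Hypothetical expression amounts of one gene sampled during culture.
course = [(0, 100.0), (6, 250.0), (12, 400.0), (24, 150.0)]
print(fold_changes(course))
print(peak_time(course))
```

Comparing such profiles between culture conditions is one simple way of using the fluctuation data when adjusting the conditions.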

[0178] The expression profile of the microorganism at the total gene level (namely, which genes among a great 
number of genes encoded by the genome have been expressed and the expression ratio thereof) can be determined 
using a nucleic acid molecule having the sequences of many genes determined from the full genome sequence of the 
microorganism. Thus, the expression amount of the genes determined by the full genome sequence can be analyzed 
and, in its turn, the biological conditions of the microorganism can be recognized as the expression pattern at the full 
gene level. 

(b) Confirmation of the presence of gene homologous to examined gene in coryneform bacteria 

[0179] Whether or not a gene homologous to the examined gene, which is present in an organism other than co- 
ryneform bacteria, is present in coryneform bacteria can be detected using the polynucleotide array prepared in the 
above (1). 

[0180] This detection can be carried out by a method in which an examined gene which is present in an organism 
other than coryneform bacteria is used instead of the nucleic acid molecule derived from coryneform bacteria used in 
the above identification/analysis method of (1). 

8. Recording medium storing full genome nucleotide sequence and ORF data and being readable by a computer and 
methods for using the same 

[0181] The term "recording medium or storage device which is readable by a computer" means a recording medium 
or storage medium which can be directly read out and accessed with a computer. Examples include magnetic recording 
media, such as a floppy disk, a hard disk, a magnetic tape, and the like; optical recording media, such as CD-ROM, 
CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and the like; electric recording media, such as RAM, ROM, and the 
like; and hybrids in these categories (for example, magnetic/optical recording media, such as MO and the like). 
[0182] Instruments for recording or inputting in or on the recording medium or instruments or devices for reading out 
the information in the recording medium can be appropriately selected, depending on the type of the recording medium 
and the access device utilized. Also, various data processing programs, software, comparators and formats are used 
for recording and utilizing the polynucleotide sequence information or the like, of the present invention in the recording 
medium. The information can be expressed in the form of a binary file, a text file or an ASCII file formatted with com- 
mercially available software, for example. Moreover, software for accessing the sequence information is available and 
known to one of ordinary skill in the art. 

[0183] Examples of the information to be recorded in the above-described medium include the full genome nucleotide 
sequence information of coryneform bacteria as obtained in the above item 2, the nucleotide sequence information of 
ORF, the amino acid sequence information encoded by the ORF, and the functional information of polynucleotides 
coding for the amino acid sequences. 
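
By way of illustration only, the categories of information listed above could be laid out in a text file as one tab-separated record per ORF. The following Python sketch is an assumption made for explanatory purposes: the field names, the file layout and the example record are hypothetical and are not taken from the present specification.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class OrfRecord:
    # Hypothetical field names mirroring the categories of information listed above.
    orf_id: str          # identifier of the ORF
    nucleotide_seq: str  # nucleotide sequence of the ORF
    amino_acid_seq: str  # amino acid sequence encoded by the ORF
    function: str        # functional information


FIELDNAMES = [f.name for f in fields(OrfRecord)]


def write_records(records, handle):
    """Write the records to a text handle as tab-separated values with a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


def read_records(handle):
    """Read the records back from a text handle written by write_records."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [OrfRecord(**row) for row in reader]


# Round trip through an in-memory text file using a made-up record.
example = [OrfRecord("orf_example", "ATGAAATAA", "MK", "hypothetical protein")]
buffer = io.StringIO()
write_records(example, buffer)
buffer.seek(0)
restored = read_records(buffer)
```

A plain text layout of this kind can be read by any of the recording media and access devices mentioned above; a binary layout would serve equally well.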

[0184] The recording medium or storage device which is readable by a computer according to the present invention 
refers to a medium in which the information of the present invention has been recorded. Examples include recording 
media or storage devices which are readable by a computer storing the nucleotide sequence information represented 
by SEQ ID NOS:1 to 3501, the amino acid sequence information represented by SEQ ID NOS:3502 to 7001, the 
functional information of the nucleotide sequences represented by SEQ ID NOS:1 to 3501, the functional information 
of the amino acid sequences represented by SEQ ID NOS:3502 to 7001, and the information listed in Table 1 below 
and the like. 

9. System based on a computer using the recording medium of the present invention which is readable by a computer 

[0185] The term "system based on a computer" as used herein refers to a system composed of hardware device(s), 
software device(s), and data recording device(s) which are used for analyzing the data recorded in the recording 
medium of the present invention which is readable by a computer. 

[0186] The hardware device(s) are, for example, composed of an input unit, a data recording unit, a central processing 
unit and an output unit collectively or individually. 
[0187] By the software device(s), the data recorded in the recording medium of the present invention are searched 
or analyzed using the recorded data and the hardware device(s) as described herein. Specifically, the software device 
(s) contain at least one program which acts on or with the system in order to screen, analyze or compare biologically 
meaningful structures or information from the nucleotide sequences, amino acid sequences and the like recorded in 
the recording medium according to the present invention. 

[0188] Examples of the software device(s) for identifying ORF and EMF domains include GeneMark (Nuc. Acids. 
Res., 22: 4756-67 (1994)), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (The 
Institute for Genomic Research; Nuc. Acids. Res., 26: 544-548 (1998)) and the like. In the process of using such a software 
device, the default (initial setting) parameters are usually used, although the parameters can be changed, if necessary, 
in a manner known to one of ordinary skill in the art. 
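
The general idea of locating an ORF can be illustrated by a deliberately naive scan for an initiation codon followed in frame by a termination codon. The Python sketch below is not the algorithm of GeneMark, GeneHacker or Glimmer, which use statistical models; it assumes only ATG as the initiation codon, scans the forward strand only, and uses a hypothetical `min_codons` threshold.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) pairs of forward-strand ORFs.

    `end` is exclusive and includes the termination codon.
    Simplifying assumptions: ATG is the only initiation codon and
    the complementary strand is not examined.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # resume the search after the stop
    return sorted(orfs)


# Made-up sequence containing one short ORF (ATG GCT AAA TGA) in the third frame.
example_orfs = find_orfs("CCATGGCTAAATGAGG")
```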

[0189] Examples of the software device(s) for identifying a genome domain or a polypeptide domain analogous to 
the target sequence or the target structural motif (homology searching) include FASTA, BLAST, Smith-Waterman, 
GenetyxMac (manufactured by Software Development), GCG Package (manufactured by Genetic Computer Group), 
GenCore (manufactured by Compugen), and the like. In the process of using such a software device, the default (initial 
setting) parameters are usually used, although the parameters can be changed, if necessary, in a manner known to 
one of ordinary skill in the art. 
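
As an illustration of the kind of comparison carried out in homology searching, the following Python sketch computes a Smith-Waterman local alignment score with a linear gap penalty. The scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions chosen for the example and are not the default parameters of any of the software packages named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the scoring matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
shared_core = smith_waterman_score("TTACGTTT", "GGACGGG")
```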

[0190] Such a recording medium storing the full genome sequence data is useful in preparing a polynucleotide array 
by which the expression amount of a gene encoded by the genome DNA of coryneform bacteria and the expression 
profile at the total gene level of the microorganism, namely, which genes among many genes encoded by the genome 
have been expressed and the expression ratio thereof, can be determined. 

[0191] The data recording device(s) provided by the present invention are, for example, memory device(s) for re- 
cording the data recorded in the recording medium of the present invention and target sequence or target structural 
motif data, or the like, and a memory accessing device(s) for accessing the same. 

[0192] Namely, the system based on a computer according to the present invention comprises the following: 

(i) a user input device that inputs the information stored in the recording medium of the present invention, and 
target sequence or target structure motif information; 

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the information stored in the recording medium of the present invention with the 
target sequence or target structure motif information, recorded by the data storing device of (ii) for screening and 
analyzing nucleotide sequence information which is coincident with or analogous to the target sequence or target 
structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 
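
The four components (i) to (iv) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the mismatch-counting comparator and the example sequences are hypothetical, and a practical comparator would use one of the homology searching programs described above.

```python
def compare(stored, target, max_mismatches=0):
    """(iii) Comparator: report windows of stored sequences that are
    coincident with (0 mismatches) or analogous to (<= max_mismatches)
    the target sequence. Returns (name, position, mismatches) tuples."""
    hits = []
    for name, seq in stored.items():
        for pos in range(len(seq) - len(target) + 1):
            window = seq[pos:pos + len(target)]
            mismatches = sum(1 for x, y in zip(window, target) if x != y)
            if mismatches <= max_mismatches:
                hits.append((name, pos, mismatches))
    return hits


def run_system(input_records, target, max_mismatches=0):
    """Chain the components: (i) input, (ii) storage, (iii) comparison, (iv) output."""
    storage = dict(input_records)                      # (ii) temporary data storage
    hits = compare(storage, target, max_mismatches)    # (iii) comparator
    return [f"{name}\t{pos}\t{mm}" for name, pos, mm in hits]  # (iv) output lines


# (i) Input: made-up sequences and a made-up target motif.
records = [("seq_a", "GGAGGTTT"), ("seq_b", "CCAGGACC")]
exact_hits = run_system(records, "AGGA")
near_hits = compare(dict(records), "AGGA", max_mismatches=1)
```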
[0193] This system is usable in the methods in items 2 to 5 as described above for searching and analyzing the ORF 
and EMF domains, target sequence, target structural motif, etc. of a coryneform bacterium, searching homologs, 
searching and analyzing isozymes, determining the biosynthesis pathway and the signal transmission pathway, and 
identifying spots which have been found in the proteome analysis. The term "homologs" as used herein includes both 
orthologs and paralogs. 

10. Production of polypeptide using ORF derived from coryneform bacteria 

[0194] The polypeptide of the present invention can be produced using a polynucleotide comprising the ORF obtained 
in the above item 2. Specifically, the polypeptide of the present invention can be produced by expressing the polynu- 
cleotide of the present invention or a fragment thereof in a host cell, using the method described in Molecular Cloning, 
2nd ed., Current Protocols in Molecular Biology, and the like, for example, according to the following method. 
[0195] A DNA fragment having a suitable length containing a part encoding the polypeptide is prepared from the full 
length ORF sequence, if necessary. 

[0196] Also, if necessary, a DNA is prepared in which nucleotides in the part of the nucleotide sequence encoding the 
polypeptide of the present invention are replaced to give codons suitable for expression in the host cell. The DNA is useful for 
efficiently producing the polypeptide of the present invention. 
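
The codon replacement described above can be sketched in Python as follows. The standard genetic code is used for translation; the table of preferred codons (`PREFERRED`) is a hypothetical placeholder and does not represent the measured codon usage of any particular host.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Hypothetical preferred codons for an unspecified host (illustration only).
PREFERRED = {"K": "AAG", "L": "CTG"}


def translate(coding_seq):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[coding_seq[i:i + 3]]
                   for i in range(0, len(coding_seq), 3))


def replace_codons(coding_seq, preferred):
    """Replace each codon by the preferred synonymous codon, if one is listed."""
    coding_seq = coding_seq.upper()
    if len(coding_seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(coding_seq), 3):
        codon = coding_seq[i:i + 3]
        amino = CODON_TABLE[codon]
        new_codon = preferred.get(amino, codon)
        if CODON_TABLE[new_codon] != amino:
            raise ValueError(f"{new_codon} does not encode {amino}")
        out.append(new_codon)
    return "".join(out)


original = "ATGAAATTATAA"            # made-up sequence: Met-Lys-Leu-stop
optimized = replace_codons(original, PREFERRED)
```

The encoded amino acid sequence is unchanged by construction, which can be confirmed by translating both sequences.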

[0197] A recombinant vector is prepared by inserting the DNA fragment downstream of a promoter in a 
suitable expression vector. 

[0198] The recombinant vector is introduced into a host cell suitable for the expression vector. 

[0199] Any of bacteria, yeasts, animal cells, insect cells, plant cells, and the like can be used as the host cell so long 
as it can express the gene of interest. 

[0200] Examples of the expression vector include those which can replicate autonomously in the above-described 
host cell or can be integrated into chromosome and have a promoter at such a position that the DNA encoding the 
polypeptide of the present invention can be transcribed. 

[0201] When a procaryote cell, such as a bacterium or the like, is used as the host cell, it is preferred that the 
recombinant vector containing the DNA encoding the polypeptide of the present invention can replicate autonomously 
in the bacterium and is a recombinant vector constituted by at least a promoter, a ribosome binding sequence, the 
DNA of the present invention and a transcription termination sequence. A gene controlling the promoter can also be 
contained therein in operable combination. 

[0202] Examples of the expression vectors include a vector plasmid which is replicable in Corynebacterium glutamicum, 
such as pCG1 (Japanese Published Unexamined Patent Application No. 134500/82), pCG2 (Japanese Published 
Unexamined Patent Application No. 35197/83), pCG4 (Japanese Published Unexamined Patent Application No. 
183799/82), pCG11 (Japanese Published Unexamined Patent Application No. 134500/82), pCG116, pCE54 and 
pCB101 (Japanese Published Unexamined Patent Application No. 105999/83), pCE51, pCE52 and pCE53 (Mol. Gen. 
Genet., 196: 175-178 (1984)), and the like; a vector plasmid which is replicable in Escherichia coli, such as pET3 and 
pET11 (manufactured by Stratagene), pBAD, pThioHis and pTrcHis (manufactured by Invitrogen), pKK223-3 and 
pGEX2T (manufactured by Amersham Pharmacia Biotech), and the like; and pBTrp2, pBTac1 and pBTac2 
(manufactured by Boehringer Mannheim Co.), pSE280 (manufactured by Invitrogen), pGEMEX-1 (manufactured by Promega), 
pQE-8 (manufactured by QIAGEN), pKYP10 (Japanese Published Unexamined Patent Application No. 110600/83), 
pKYP200 (Agric. Biol. Chem., 48: 669 (1984)), pLSA1 (Agric. Biol. Chem., 53: 277 (1989)), pGEL1 (Proc. Natl. Acad. 
Sci. USA, 82: 4306 (1985)), pBluescript II SK(-) (manufactured by Stratagene), pTrs30 (prepared from Escherichia coli 
JM109/pTrS30 (FERM BP-5407)), pTrs32 (prepared from Escherichia coli JM109/pTrS32 (FERM BP-5408)), pGHA2 
(prepared from Escherichia coli IGHA2 (FERM BP-400), Japanese Published Unexamined Patent Application No. 
221091/85), pGKA2 (prepared from Escherichia coli IGKA2 (FERM BP-6798), Japanese Published Unexamined Patent 
Application No. 221091/85), pTerm2 (U.S. Patents 4,686,191, 4,939,094 and 5,160,735), pSupex, pUB110, pTP5, 
pC194 and pEG400 (J. Bacteriol., 172: 2392 (1990)), pGEX (manufactured by Pharmacia), pET system (manufactured 
by Novagen), and the like. 

[0203] Any promoter can be used so long as it can function in the host cell. Examples include promoters derived 
from Escherichia coli, phage and the like, such as trp promoter (Ptrp), lac promoter, PL promoter, PR promoter, T7 
promoter and the like. Also, artificially designed and modified promoters, such as a promoter in which two Ptrp are 
linked in series (Ptrp x2), tac promoter, lacT7 promoter, letI promoter and the like, can be used. 
[0204] It is preferred to use a plasmid in which the space between the Shine-Dalgarno sequence, which is the ribosome 
binding sequence, and the initiation codon is adjusted to an appropriate distance (for example, 6 to 18 nucleotides). 
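
The spacing condition described above can be checked with a short Python sketch. The motif AGGAGG used as the Shine-Dalgarno sequence, the use of ATG as the initiation codon, and the definition of the spacing as the number of nucleotides between the end of the motif and the first nucleotide of the initiation codon are all assumptions made for the purpose of illustration.

```python
def sd_spacing(seq, sd="AGGAGG", start_codon="ATG"):
    """Return the number of nucleotides between the end of the first SD motif
    and the next initiation codon, or None if either is absent."""
    seq = seq.upper()
    sd_pos = seq.find(sd)
    if sd_pos < 0:
        return None
    sd_end = sd_pos + len(sd)
    start_pos = seq.find(start_codon, sd_end)
    if start_pos < 0:
        return None
    return start_pos - sd_end


def spacing_ok(seq, low=6, high=18, **kwargs):
    """True if the spacing lies within the range given as an example in the text."""
    spacing = sd_spacing(seq, **kwargs)
    return spacing is not None and low <= spacing <= high


# Made-up sequences: a spacer of 7 nucleotides, and no spacer at all.
good = "TTAGGAGGAACAGCTATGGCT"
too_close = "AGGAGGATG"
```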
[0205] The transcription termination sequence is not always necessary for the expression of the DNA of the present 
invention. However, it is preferred to arrange the transcription termination sequence immediately downstream of the structural 
gene. 

[0206] One of ordinary skill in the art will appreciate that the codons of the above-described elements may be 
optimized, in a known manner, depending on the host cells and environmental conditions utilized. 
[0207] Examples of the host cell include microorganisms belonging to the genus Escherichia, the genus Serratia, 
the genus Bacillus, the genus Brevibacterium, the genus Corynebacterium, the genus Microbacterium, the genus 
Pseudomonas, and the like. Specific examples include Escherichia coli XL1-Blue, Escherichia coli XL2-Blue, Escherichia 
coli DH1, Escherichia coli MC1000, Escherichia coli KY3276, Escherichia coli W1485, Escherichia coli JM109, 
Escherichia coli HB101, Escherichia coli No. 49, Escherichia coli W3110, Escherichia coli NY49, Escherichia coli GI698, 
Escherichia coli TB1, Serratia ficaria, Serratia fonticola, Serratia liquefaciens, Serratia marcescens, Bacillus subtilis, 
Bacillus amyloliquefaciens, Corynebacterium ammoniagenes, Brevibacterium immariophilum ATCC 14068, 
Brevibacterium saccharolyticum ATCC 14066, Corynebacterium glutamicum ATCC 13032, Corynebacterium glutamicum ATCC 
13869, Corynebacterium glutamicum ATCC 14067 (prior genus and species: Brevibacterium flavum), Corynebacterium 
glutamicum ATCC 13869 (prior genus and species: Brevibacterium lactofermentum, or Corynebacterium 
lactofermentum), Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium thermoaminogenes FERM 9244, 
Microbacterium ammoniaphilum ATCC 15354, Pseudomonas putida, Pseudomonas sp. D-0110, and the like. 
[0208] When Corynebacterium glutamicum or an analogous microorganism is used as a host, an EMF necessary 
for expressing the polypeptide is not always contained in the vector so long as the polynucleotide of the present in- 
vention contains an EMF. When the EMF is not contained in the polynucleotide, it is necessary to prepare the EMF 
separately and ligate it so as to be in operable combination. Also, when a higher expression amount or specific ex- 
pression regulation is necessary, it is necessary to ligate the EMF corresponding thereto so as to put the EMF in 
operable combination with the polynucleotide. Examples of using an externally ligated EMF are disclosed in 
Microbiology, 142: 1297-1309 (1996). 

[0209] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into 
the above-described host cells, such as a method in which a calcium ion is used (Proc. Natl. Acad. Sci. USA, 69: 2110 
(1972)), a protoplast method (Japanese Published Unexamined Patent Application No. 248394/88), the methods 
described in Gene, 17: 107 (1982) and Molecular & General Genetics, 168: 111 (1979) and the like, can be used. 
[0210] When yeast is used as the host cell, examples of the expression vector include pYES2 (manufactured by 
Invitrogen), YEp13 (ATCC 37115), YEp24 (ATCC 37051), YCp50 (ATCC 37419), pHS19, pHS15, and the like. 
[0211] Any promoter can be used so long as it can be expressed in yeast. Examples include a promoter of a gene 
in the glycolytic pathway, such as hexose kinase and the like, PHO5 promoter, PGK promoter, GAP promoter, ADH 
promoter, gal 1 promoter, gal 10 promoter, a heat shock protein promoter, MFα1 promoter, CUP 1 promoter, and the like. 
[0212] Examples of the host cell include microorganisms belonging to the genus Saccharomyces, the genus 
Schizosaccharomyces, the genus Kluyveromyces, the genus Trichosporon, the genus Schwanniomyces, the genus 
Pichia, the genus Candida and the like. Specific examples include Saccharomyces cerevisiae, Schizosaccharomyces 
pombe, Kluyveromyces lactis, Trichosporon pullulans, Schwanniomyces alluvius, Candida utilis and the like. 
[0213] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into 
yeast, such as an electroporation method (Methods Enzymol., 194: 182 (1990)), a spheroplast method (Proc. Natl. 
Acad. Sci. USA, 75: 1929 (1978)), a lithium acetate method (J. Bacteriol., 153: 163 (1983)), a method described in 
Proc. Natl. Acad. Sci. USA, 75: 1929 (1978) and the like, can be used. 

[0214] When animal cells are used as the host cells, examples of the expression vector include pcDNA3.1, pSinRep5 
and pCEP4 (manufactured by Invitrogen), pRev-Tre (manufactured by Clontech), pAxCAwt (manufactured by Takara 
Shuzo), pcDNAI and pcDM8 (manufactured by Funakoshi), pAGE107 (Japanese Published Unexamined Patent Ap- 
plication No. 22979/91; Cytotechnology, 3:133 (1990)), pAS3-3 (Japanese Published Unexamined Patent Application 
No. 227075/90), pcDM8 (Nature, 329: 840 (1987)), pcDNAI/Amp (manufactured by Invitrogen), pREP4 (manufactured 
by Invitrogen), pAGE103 (J. Biochem., 101: 1307 (1987)), pAGE210, and the like. 

[0215] Any promoter can be used so long as it can function in animal cells. Examples include a promoter of IE 
(immediate early) gene of cytomegalovirus (CMV), an early promoter of SV40, a promoter of retrovirus, a 
metallothionein promoter, a heat shock promoter, SRα promoter, and the like. Also, the enhancer of the IE gene of human 
CMV can be used together with the promoter. 

[0216] Examples of the host cell include human Namalwa cell, monkey COS cell, Chinese hamster CHO cell, 
HBT5637 (Japanese Published Unexamined Patent Application No. 299/88), and the like. 

[0217] The method for introduction of the recombinant vector into animal cells is not particularly limited, so long as 
it is the general method for introducing DNA into animal cells, such as an electroporation method (Cytotechnology, 3: 
133 (1990)), a calcium phosphate method (Japanese Published Unexamined Patent Application No. 227075/90), a 
lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413 (1987)), the method described in Virology, 52: 456 (1973), 
and the like. 

[0218] When insect cells are used as the host cells, the polypeptide can be expressed, for example, by the method 
described in Baculovirus Expression Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992), 
Bio/Technology, 6: 47 (1988), or the like. 

[0219] Specifically, a recombinant gene transfer vector and baculovirus are simultaneously introduced into insect cells 
to obtain a recombinant virus in an insect cell culture supernatant, and then the insect cells are infected with the resulting 
recombinant virus to express the polypeptide. 

[0220] Examples of the gene introducing vector used in the method include pBlueBac4.5, pVL1392, pVL1393 and 
pBlueBacIII (manufactured by Invitrogen), and the like. 

[0221] Examples of the baculovirus include Autographa californica nuclear polyhedrosis virus with which insects of 
the family Barathra are infected, and the like. 

[0222] Examples of the insect cells include Spodoptera frugiperda oocytes Sf9 and Sf21 (Baculovirus Expression 
Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992)), Trichoplusia ni oocyte High 5 
(manufactured by Invitrogen) and the like. 

[0223] Examples of the method for simultaneously incorporating the above-described recombinant gene transfer vector and the 
above-described baculovirus for the preparation of the recombinant virus include the calcium phosphate method (Japanese 
Published Unexamined Patent Application No. 227075/90), the lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413 
(1987)) and the like. 

[0224] When plant cells are used as the host cells, examples of expression vector include a Ti plasmid, a tobacco 
mosaic virus vector, and the like. 

[0225] Any promoter can be used so long as it can be expressed in plant cells. Examples include 35S promoter of 
cauliflower mosaic virus (CaMV), rice actin 1 promoter, and the like. 

[0226] Examples of the host cells include plant cells and the like, such as tobacco, potato, tomato, carrot, soybean, 
rape, alfalfa, rice, wheat, barley, and the like. 

[0227] The method for introducing the recombinant vector is not particularly limited, so long as it is the general method 
for introducing DNA into plant cells, such as the Agrobacterium method (Japanese Published Unexamined Patent 
Application No. 140885/84, Japanese Published Unexamined Patent ...), the 
electroporation method (Japanese Published Unexamined Patent Application No. 251887/85), the particle gun method 
(Japanese Patents 2606856 and 2517813), and the like. 

[0228] The transformant of the present invention includes a transformant containing the polypeptide of the present 
invention per se rather than as a recombinant vector, that is, a transformant containing the polypeptide of the present 
invention which is integrated into a chromosome of the host, in addition to the transformant containing the above 
recombinant vector. 

[0229] When expressed in yeasts, animal cells, insect cells or plant cells, a glycopolypeptide or glycosylated polypep- 
tide can be obtained. 

[0230] The polypeptide can be produced by culturing the thus obtained transformant of the present invention in a 
culture medium to produce and accumulate the polypeptide of the present invention or any polypeptide expressed 
under the control of an EMF of the present invention, and recovering the polypeptide from the culture. 
[0231] Culturing of the transformant of the present invention in a culture medium is carried out according to the 
conventional method as used in culturing of the host. 

[0232] When the transformant of the present invention is obtained using a prokaryote, such as Escherichia coli or 
the like, or a eukaryote, such as yeast or the like, as the host, the transformant is cultured in a medium as described below. 

[0233] Either a natural medium or a synthetic medium can be used, so long as it contains a carbon source, a 
nitrogen source, an inorganic salt and the like which can be assimilated by the transformant and allows the 
transformant to be cultured efficiently. 

[0234] Examples of the carbon source include those which can be assimilated by the transformant, such as carbo- 
hydrates (for example, glucose, fructose, sucrose, molasses containing them, starch, starch hydrolysate, and the like), 
organic acids (for example, acetic acid, propionic acid, and the like), and alcohols (for example, ethanol, propanol, and 
the like). 

[0235] Examples of the nitrogen source include ammonia, various ammonium salts of inorganic acids or organic 
acids (for example, ammonium chloride, ammonium sulfate, ammonium acetate, ammonium phosphate, and the like), 
other nitrogen-containing compounds, peptone, meat extract, yeast extract, corn steep liquor, casein hydrolysate, soy- 
bean meal and soybean meal hydrolysate, various fermented cells and hydrolysates thereof, and the like. 
[0236] Examples of inorganic salt include potassium dihydrogen phosphate, dipotassium hydrogen phosphate, mag- 
nesium phosphate, magnesium sulfate, sodium chloride, ferrous sulfate, manganese sulfate, copper sulfate, calcium 
carbonate, and the like. 

[0237] The culturing is carried out under aerobic conditions by shaking culture, submerged-aeration stirring culture 
or the like. The culturing temperature is preferably from 15 to 40°C, and the culturing time is generally from 16 hours 
to 7 days. The pH of the medium is preferably maintained at 3.0 to 9.0 during the culturing. The pH can be adjusted 
using an inorganic or organic acid, an alkali solution, urea, calcium carbonate, ammonia, or the like. 
[0238] Also, antibiotics, such as ampicillin, tetracycline, and the like, can be added to the medium during the culturing, 
if necessary. 

[0239] When a microorganism transformed with a recombinant vector containing an inducible promoter is cultured, 
an inducer can be added to the medium, if necessary. 

[0240] For example, isopropyl-β-D-thiogalactopyranoside (IPTG) or the like can be added to the medium when a 
microorganism transformed with a recombinant vector containing lac promoter is cultured, or indoleacrylic acid (IAA) 
or the like can be added thereto when a microorganism transformed with an expression vector containing trp promoter 
is cultured. 

[0241] Examples of the medium used in culturing a transformant obtained using animal cells as the host cells include 
RPMI 1640 medium (The Journal of the American Medical Association, 199: 519 (1967)), Eagle's MEM medium 
(Science, 122: 501 (1952)), Dulbecco's modified MEM medium (Virology, 8: 396 (1959)), 199 Medium (Proceeding of the 
Society for the Biological Medicine, 73:1 (1950)), the above-described media to which fetal calf serum has been added, 
and the like. 

[0242] The culturing is carried out generally at a pH of 6 to 8 and a temperature of 30 to 40°C in the presence of 5% 
CO2 for 1 to 7 days. 

[0243] Also, if necessary, antibiotics, such as kanamycin, penicillin, and the like, can be added to the medium during 
the culturing. 

[0244] Examples of the medium used in culturing a transformant obtained using insect cells as the host cells include 
TNM-FH medium (manufactured by Pharmingen), Sf-900 II SFM (manufactured by Life Technologies), ExCell 400 and 
ExCell 405 (manufactured by JRH Biosciences), Grace's Insect Medium (Nature, 195: 788 (1962)), and the like. 
[0245] The culturing is carried out generally at a pH of 6 to 7 and a temperature of 25 to 30°C for 1 to 5 days. 
[0246] Additionally, antibiotics, such as gentamicin and the like, can be added to the medium during the culturing, if 
necessary. 

[0247] A transformant obtained by using a plant cell as the host cell can be used as the cell or after differentiating 
to a plant cell or plant organ. Examples of the medium used in the culturing of the transformant include Murashige and Skoog 
(MS) medium, White medium, media to which a plant hormone, such as auxin, cytokinine, or the like has been added, 
and the like. 

[0248] The culturing is carried out generally at a pH of 5 to 9 and a temperature of 20 to 40°C for 3 to 60 days. 
[0249] Also, antibiotics, such as kanamycin, hygromycin and the like, can be added to the medium during the cul- 
turing, if necessary. 
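
The culturing conditions given above for the respective host types can be collected in a small lookup table, as in the following Python sketch. The ranges are those stated in the text (a culturing time of 16 hours is expressed as 16/24 days); the class, the dictionary keys and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CultureRange:
    ph: tuple       # (lowest, highest) pH
    temp_c: tuple   # (lowest, highest) temperature in degrees Celsius
    days: tuple     # (shortest, longest) culturing time in days


# Ranges as stated in the text for each type of host.
RANGES = {
    "microorganism": CultureRange((3.0, 9.0), (15, 40), (16 / 24, 7)),
    "animal": CultureRange((6, 8), (30, 40), (1, 7)),
    "insect": CultureRange((6, 7), (25, 30), (1, 5)),
    "plant": CultureRange((5, 9), (20, 40), (3, 60)),
}


def within(host, ph, temp_c, days):
    """True if the proposed conditions fall inside the stated ranges for the host."""
    r = RANGES[host]
    checks = ((ph, r.ph), (temp_c, r.temp_c), (days, r.days))
    return all(low <= value <= high for value, (low, high) in checks)


insect_ok = within("insect", ph=6.5, temp_c=27, days=3)
animal_too_cold = within("animal", ph=7.0, temp_c=25, days=2)
```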

[0250] As described above, the polypeptide can be produced by culturing a transformant derived from a microor- 
ganism, animal cell or plant cell containing a recombinant vector to which a DNA encoding the polypeptide of the 
present invention has been inserted according to the general culturing method to produce and accumulate the polypep- 
tide, and recovering the polypeptide from the culture. 

[0251] In addition to direct expression, the gene may be expressed by secretory production, fusion protein 
expression and the like in accordance with the methods described in Molecular Cloning, 2nd ed. 

[0252] The method for producing the polypeptide of the present invention includes a method of intracellular expres- 
sion in a host cell, a method of extracellular secretion from a host cell, or a method of production on a host cell membrane 
outer envelope. The method can be selected by changing the host cell employed or the structure of the polypeptide 
produced. 

[0253] When the polypeptide of the present invention is produced in a host cell or on a host cell membrane outer 
envelope, the polypeptide can be actively secreted extracellularly according to, for example, the method of Paulson 
et al. (J. Biol. Chem., 264: 17619 (1989)), the method of Lowe et al. (Proc. Natl. Acad. Sci. USA, 86: 8227 (1989); 
Genes Develop., 4: 1288 (1990)), and/or the methods described in Japanese Published Unexamined Patent Application 
No. 336963/93, WO 94/23021, and the like. 

[0254] Specifically, the polypeptide of the present invention can be actively secreted extracellularly by expressing 
it in a form in which a signal peptide has been added in front of a polypeptide containing an active site of the 
polypeptide of the present invention according to the recombinant DNA technique. 

[0255] Furthermore, the amount produced can be increased using a gene amplification system, such as by use of 
a dihydrofolate reductase gene or the like according to the method described in Japanese Published Unexamined 
Patent Application No. 227075/90. 

[0256] Moreover, the polypeptide of the present invention can be produced by a transgenic animal individual (trans- 
genic nonhuman animal) or plant individual (transgenic plant). 

[0257] When the transformant is the animal individual or plant individual, the polypeptide of the present invention 
can be produced by breeding or cultivating it so as to produce and accumulate the polypeptide, and recovering the 
polypeptide from the animal individual or plant individual. 

[0258] Examples of the method for producing the polypeptide of the present invention using the animal individual 
include a method for producing the polypeptide of the present invention in an animal developed by inserting a gene 
according to methods known to those of ordinary skill in the art (American Journal of Clinical Nutrition, 63: 639S (1996), 
American Journal of Clinical Nutrition, 63: 627S (1996), Bio/Technology, 9: 830 (1991)). 
[0259] In the animal individual, the polypeptide can be produced by breeding a transgenic nonhuman animal to which 
the DNA encoding the polypeptide of the present invention has been inserted to produce and accumulate the 
polypeptide in the animal, and recovering the polypeptide from the animal. Examples of the production and accumulation place 
in the animal include milk (Japanese Published Unexamined Patent Application No. 309192/88), egg and the like of 
the animal. Any promoter can be used, so long as it can be expressed in the animal. Suitable examples include an 
α-casein promoter, a β-casein promoter, a β-lactoglobulin promoter, a whey acidic protein promoter, and the like, which 
are specific for mammary glandular cells. 

[0260] Examples of the method for producing the polypeptide of the present invention using the plant individual 
include a method for producing the polypeptide of the present invention by cultivating a transgenic plant to which the 
DNA encoding the protein of the present invention has been introduced by a known method (Tissue Culture, 20 (1994), 
Tissue Culture, 21 (1994), Trends in Biotechnology, 15: 45 (1997)) to produce and accumulate the polypeptide in the plant, and recovering 
the polypeptide from the plant. 

[0261] The polypeptide according to the present invention can also be obtained by translation in vitro. 

[0262] The polypeptide of the present invention can be produced by a translation system in vitro. There are, for 
example, two in vitro translation methods which may be used, namely, a method using RNA as a template and another 
method using DNA as a template. The template RNA includes the whole RNA, mRNA, an in vitro transcription product, 
and the like. The template DNA includes a plasmid containing a transcriptional promoter and a target gene integrated 
therein and downstream of the initiation site, a PCR/RT-PCR product and the like. To select the most suitable system 
for the in vitro translation, the origin of the gene encoding the protein to be synthesized (prokaryotic cell/eukaryotic 
cell), the type of the template (DNA/RNA), the purpose of using the synthesized protein and the like should be 
considered. In vitro translation kits having various characteristics are commercially available from many companies 
(Boehringer Mannheim, Promega, Stratagene, or the like), and every kit can be used in producing the polypeptide according 
to the present invention. 

[0263] Transcription/translation of a DNA nucleotide sequence cloned into a plasmid containing a T7 promoter can 
be carried out using an in vitro transcription/translation system E. coli T7 S30 Extract System for Circular DNA 
(manufactured by Promega, catalogue No. L1130). Also, transcription/translation using, as a template, a linear prokaryotic 
DNA of a supercoil non-sensitive promoter, such as lacUV5, tac, λPL(con), λPL, or the like, can be carried out using 
an in vitro transcription/translation system E. coli S30 Extract System for Linear Templates (manufactured by Promega, 
catalogue No. L1030). Examples of the linear prokaryotic DNA used as a template include a DNA fragment, a 
PCR-amplified DNA product, a duplicated oligonucleotide ligation, an in vitro transcriptional RNA, a prokaryotic RNA, and 
the like. 

[0264] In addition to the production of the polypeptide according to the present invention, synthesis of a radioactive 
labeled protein, confirmation of the expression capability of a cloned gene, analysis of the function of transcriptional 
reaction or translation reaction, and the like can be carried out using this system. 
[0265] The polypeptide produced by the transformant of the present invention can be isolated and purified using the 
general method for isolating and purifying an enzyme. For example, when the polypeptide of the present invention is 
expressed as a soluble product in the host cells, the cells are collected by centrifugation after cultivation, suspended 
in an aqueous buffer, and disrupted using an ultrasonicator, a French press, a Manton Gaulin homogenizer, a Dynomill, 
or the like to obtain a cell-free extract. From the supernatant obtained by centrifuging the cell-free extract, a purified 
product can be obtained by the general method used for isolating and purifying an enzyme, for example, solvent 
extraction, salting out using ammonium sulfate or the like, desalting, precipitation using an organic solvent, anion 
exchange chromatography using a resin, such as diethylaminoethyl (DEAE)-Sepharose, DIAION HPA-75 (manufactured 
by Mitsubishi Chemical) or the like, cation exchange chromatography using a resin, such as S-Sepharose FF 
(manufactured by Pharmacia) or the like, hydrophobic chromatography using a resin, such as butyl sepharose, phenyl 
sepharose or the like, gel filtration using a molecular sieve, affinity chromatography, chromatofocusing, or electrophoresis, 
such as isoelectric focusing or the like, alone or in combination thereof. 
[0266] When the polypeptide is expressed as an insoluble product in the host cells, the cells are collected in the 
same manner, disrupted and centrifuged to recover the insoluble product of the polypeptide as the precipitate fraction. 
Next, the insoluble product of the polypeptide is solubilized with a protein denaturing agent. The solubilized solution 
is diluted or dialyzed to lower the concentration of the protein denaturing agent in the solution. Thus, the normal
configuration of the polypeptide is reconstituted. After the procedure, a purified product of the polypeptide can be obtained
by a purification/isolation method similar to the above. 

[0267] When the polypeptide of the present invention or its derivative (for example, a polypeptide formed by adding 
a sugar chain thereto) is secreted out of cells, the polypeptide or its derivative can be collected in the culture supernatant. 
Namely, the culture supernatant is obtained by treating the culture medium in a treatment similar to the above (for 
example, centrifugation). Then, a purified product can be obtained from the culture medium using a purification/isolation 
method similar to the above. 

[0268] The polypeptide obtained by the above method is within the scope of the polypeptide of the present invention,
and examples include a polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from
SEQ ID NOS:2 to 3431, and a polypeptide comprising an amino acid sequence represented by any one of SEQ ID 
NOS:3502 to 6931. 

[0269] Furthermore, a polypeptide comprising an amino acid sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of the polypeptide and having substantially the same activity
as that of the polypeptide is included in the scope of the present invention. The term "substantially the same activity
as that of the polypeptide" means the same activity represented by the inherent function, enzyme activity or the like
possessed by the polypeptide in which no amino acid has been deleted, replaced, inserted or added. The polypeptide can be
obtained using a method for introducing site-specific mutation(s) described in, for example, Molecular Cloning, 2nd ed.,
Current Protocols in Molecular Biology, Nuc. Acids. Res., 10: 6487 (1982), Proc. Natl. Acad. Sci. USA, 79: 6409 (1982),
Gene, 34: 315 (1985), Nuc. Acids. Res., 13: 4431 (1985), Proc. Natl. Acad. Sci. USA, 82: 488 (1985) and the like. For
example, the polypeptide can be obtained by introducing mutation(s) to DNA encoding a polypeptide having the amino
acid sequence represented by any one of SEQ ID NOS:3502 to 6931. The number of the amino acids which are deleted,
replaced, inserted or added is not particularly limited; however, it is usually 1 to the order of tens, preferably 1 to 20,
more preferably 1 to 10, and most preferably 1 to 5, amino acids.

[0270] The at least one amino acid deletion, replacement, insertion or addition in the amino acid sequence of the
polypeptide of the present invention is used herein to mean that at least one amino acid is deleted, replaced, inserted
or added at one or more positions in the amino acid sequence. The deletion, replacement, insertion or addition may
be caused in the same amino acid sequence simultaneously. Also, the amino acid residue replaced, inserted or added
can be natural or non-natural. Examples of the natural amino acid residue include L-alanine, L-asparagine, L-aspartic
acid, L-glutamine, L-glutamic acid, glycine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine,
L-proline, L-serine, L-threonine, L-tryptophan, L-tyrosine, L-valine, L-cysteine, and the like.

[0271] Herein, examples of amino acid residues which are replaced with each other are shown below. The amino 
acid residues in the same group can be replaced with each other. 

Group A: 

[0272] leucine, isoleucine, norleucine, valine, norvaline, alanine, 2-aminobutanoic acid, methionine, O-methylserine, 
t-butylglycine, t-butylalanine, cyclohexylalanine; 

Group B: 

[0273] aspartic acid, glutamic acid, isoaspartic acid, isoglutamic acid, 2-aminoadipic acid, 2-aminosuberic acid;
Group C: 

[0274] asparagine, glutamine; 
Group D: 

[0275] lysine, arginine, ornithine, 2,4-diaminobutanoic acid, 2,3-diaminopropionic acid; 
Group E: 

[0276] proline, 3-hydroxyproline, 4-hydroxyproline; 
Group F: 

[0277] serine, threonine, homoserine; 
Group G: 

[0278] phenylalanine, tyrosine. 
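
The grouping above can be expressed as a small lookup table. The following is a minimal sketch only: the names `SUBSTITUTION_GROUPS`, `group_of` and `is_same_group` are hypothetical and are introduced here purely for illustration. It checks whether two residue names fall in the same group (A to G) and says nothing about whether activity is actually retained.

```python
# Substitution groups A to G, transcribed from the list in the text.
# All names are lower-cased so that lookups are case-insensitive.
SUBSTITUTION_GROUPS = {
    "A": {"leucine", "isoleucine", "norleucine", "valine", "norvaline",
          "alanine", "2-aminobutanoic acid", "methionine", "o-methylserine",
          "t-butylglycine", "t-butylalanine", "cyclohexylalanine"},
    "B": {"aspartic acid", "glutamic acid", "isoaspartic acid",
          "isoglutamic acid", "2-aminoadipic acid", "2-aminosuberic acid"},
    "C": {"asparagine", "glutamine"},
    "D": {"lysine", "arginine", "ornithine", "2,4-diaminobutanoic acid",
          "2,3-diaminopropionic acid"},
    "E": {"proline", "3-hydroxyproline", "4-hydroxyproline"},
    "F": {"serine", "threonine", "homoserine"},
    "G": {"phenylalanine", "tyrosine"},
}


def group_of(residue):
    """Return the group label (A to G) of a residue name, or None if unlisted."""
    name = residue.strip().lower()
    for label, members in SUBSTITUTION_GROUPS.items():
        if name in members:
            return label
    return None


def is_same_group(residue_a, residue_b):
    """True only if both residues are listed and belong to the same group."""
    label = group_of(residue_a)
    return label is not None and label == group_of(residue_b)
```

For example, `is_same_group("leucine", "valine")` is true (both in Group A), whereas lysine (Group D) and glutamic acid (Group B) are not interchangeable under this table. Residues that appear in no group, such as glycine, return `None` from `group_of`.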

[0279] Also, in order that the resulting mutant polypeptide has substantially the same activity as that of the polypeptide 
which has not been mutated, it is preferred that the mutant polypeptide has a homology of 60% or more, preferably 
80% or more, and particularly preferably 95% or more, with the polypeptide which has not been mutated, when calcu- 
lated, for example, using default (initial setting) parameters by a homology searching software, such as BLAST, FASTA, 
or the like. 
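
As a rough illustration of the thresholds above, the following sketch computes percent identity between two sequences that have already been aligned and maps the value onto the 60%, 80% and 95% tiers. It is a simplification under stated assumptions: BLAST and FASTA compute their own alignments and statistics, and the function names and the gap character `-` are hypothetical choices made for this example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns; columns that are gaps in both
    sequences are ignored. The inputs must already be aligned to equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def homology_tier(pct):
    """Map a percentage onto the tiers named in the text."""
    if pct >= 95:
        return "particularly preferred (95% or more)"
    if pct >= 80:
        return "preferred (80% or more)"
    if pct >= 60:
        return "acceptable (60% or more)"
    return "below 60%"
```

A mutant that differs from a 4-residue parent at one aligned position scores 75% and falls in the "60% or more" tier; real comparisons would of course use full-length sequences and a proper alignment tool.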
[0280] Also, the polypeptide of the present invention can be produced by a chemical synthesis method, such as
Fmoc (fluorenylmethyloxycarbonyl) method, tBoc (t-butyloxycarbonyl) method, or the like. It can also be synthesized 
using a peptide synthesizer manufactured by Advanced ChemTech, Perkin-Elmer, Pharmacia, Protein Technology 
Instrument, Synthecell-Vega, PerSeptive, Shimadzu Corporation, or the like. 

[0281] The transformant of the present invention can be used for purposes other than the production of the polypeptide
of the present invention. 

[0282] Specifically, at least one component selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an 
organic acid, and analogues thereof can be produced by culturing the transformant containing the polynucleotide or 
recombinant vector of the present invention in a medium to produce and accumulate at least one component selected 
from amino acids, nucleic acids, vitamins, saccharides, organic acids, and analogues thereof, and recovering the same 
from the medium. 

[0283] The biosynthesis pathways, decomposition pathways and regulatory mechanisms of physiologically active 
substances such as amino acids, nucleic acids, vitamins, saccharides, organic acids and analogues thereof differ from 
organism to organism. The productivity of such a physiologically active substance can be improved using these differ- 
ences, specifically by introducing a heterogeneous gene relating to the biosynthesis thereof. For example, the content 
of lysine, which is one of the essential amino acids, in a plant seed was improved by introducing a synthase gene 
derived from a bacterium (WO 93/19190). Also, arginine is excessively produced in a culture by introducing an arginine 
synthase gene derived from Escherichia coli (Japanese Examined Patent Publication 23750/93).
[0284] To produce such a physiologically active substance, the transformant according to the present invention can 
be cultured by the same method as employed in culturing the transformant for producing the polypeptide of the present 
invention as described above. Also, the physiologically active substance can be recovered from the culture medium
in combination with, for example, the ion exchange resin method.

[0285] Examples of methods known to one of ordinary skill in the art include electroporation, calcium transfection,
the protoplast method, the method using a phage, and the like, when the host is a bacterium; and microinjection,
calcium phosphate transfection, the positively charged lipid-mediated method and the method using a virus, and the
like, when the host is a eukaryote (Molecular Cloning, 2nd ed.; Spector et al., Cells: a laboratory manual, Cold Spring
Harbor Laboratory Press, 1998). Examples of the host include prokaryotes, lower eukaryotes (for example, yeasts),
higher eukaryotes (for example, mammals), and cells isolated therefrom. As for the state of a recombinant polynucleotide
fragment present in the host cells, it can be integrated into the chromosome of the host. Alternatively, it can be integrated
into a factor (for example, a plasmid) having an independent replication unit outside the chromosome. These
transformants are usable in producing the polypeptides of the present invention encoded by the ORF of the genome of
Corynebacterium glutamicum, the polynucleotides of the present invention and fragments thereof. Alternatively, they
can be used in producing arbitrary polypeptides under the regulation by an EMF of the present invention.

11. Preparation of antibody recognizing the polypeptide of the present invention 

[0286] An antibody which recognizes the polypeptide of the present invention, such as a polyclonal antibody, a mon- 
oclonal antibody, or the like, can be produced using, as an antigen, a purified product of the polypeptide of the present 
invention or a partial fragment polypeptide of the polypeptide or a peptide having a partial amino acid sequence of the 
polypeptide of the present invention. 

(1) Production of polyclonal antibody 

[0287] A polyclonal antibody can be produced using, as an antigen, a purified product of the polypeptide of the 
present invention, a partial fragment polypeptide of the polypeptide, or a peptide having a partial amino acid sequence 
of the polypeptide of the present invention, and immunizing an animal with the same. 

[0288] Examples of the animal to be immunized include rabbits, goats, rats, mice, hamsters, chickens and the like. 
[0289] A dosage of the antigen is preferably 50 to 100 µg per animal.

[0290] When the peptide is used as the antigen, it is preferably a peptide covalently bonded to a carrier protein, such 
as keyhole limpet haemocyanin, bovine thyroglobulin, or the like. The peptide used as the antigen can be synthesized 
by a peptide synthesizer. 

[0291] The administration of the antigen is, for example, carried out 3 to 10 times at the intervals of 1 or 2 weeks 
after the first administration. On the 3rd to 7th day after each administration, a blood sample is collected from the 
venous plexus of the eyeground, and it is confirmed that the serum reacts with the antigen by the enzyme immunoassay 
(Enzyme-linked Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring
Harbor Laboratory (1988)) or the like. 

[0292] Serum is obtained from the immunized non-human mammal with a sufficient antibody titer against the antigen 
used for the immunization, and the serum is isolated and purified to obtain a polyclonal antibody.
[0293] Examples of the method for the isolation and purification include centrifugation, salting out by 40-50% saturated
ammonium sulfate, caprylic acid precipitation (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory
(1988)), or chromatography using a DEAE-Sepharose column, an anion exchange column, a protein A- or G-column, 
a gel filtration column, and the like, alone or in combination thereof, by methods known to those of ordinary skill in the art. 

(2) Production of monoclonal antibody 

(a) Preparation of antibody-producing cell 

[0294] A rat whose serum shows a sufficient antibody titer against a partial fragment polypeptide of the polypeptide
of the present invention used for immunization is used as a supply source of an antibody-producing cell.
[0295] On the 3rd to 7th day after the antigen substance is finally administered to the rat showing the antibody titer, the
spleen is excised.

[0296] The spleen is cut to pieces in MEM medium (manufactured by Nissui Pharmaceutical), loosened using a pair 
of forceps, followed by centrifugation at 1,200 rpm for 5 minutes; and the resulting supernatant is discarded. 
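
Centrifugation conditions in this section are given in rpm, which depends on the rotor. As a hedged aid, the following sketch converts between rpm and relative centrifugal force (RCF, in multiples of g) using the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r in centimetres. The rotor radius is not stated in the text, so any value passed in is an assumption of the user; the function names are hypothetical.

```python
import math

# Standard conversion constant for radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm, rotor_radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    if rpm < 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and radius must be > 0")
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2


def rpm_from_rcf(rcf, rotor_radius_cm):
    """Speed (rpm) needed to reach a given RCF on a rotor of the given radius."""
    if rcf < 0 or rotor_radius_cm <= 0:
        raise ValueError("rcf must be >= 0 and radius must be > 0")
    return math.sqrt(rcf / (RCF_CONSTANT * rotor_radius_cm))
```

With an assumed 10 cm rotor radius, 1,200 rpm corresponds to roughly 161 x g; a different rotor gives a different value, which is why the radius has to be supplied explicitly.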
[0297] The spleen in the precipitated fraction is treated with a Tris-ammonium chloride buffer (pH 7.65) for 1 to 2 
minutes to eliminate erythrocytes and washed three times with MEM medium, and the resulting spleen cells are used
as antibody-producing cells. 

(b) Preparation of myeloma cells 

[0298] As myeloma cells, an established cell line obtained from mouse or rat is used. Examples of useful cell lines
include those derived from a mouse, such as P3-X63Ag8-U1 (hereinafter referred to as "P3-U1") (Curr. Topics in Microbiol.
Immunol., 81: 1 (1978); Europ. J. Immunol., 6: 511 (1976)); SP2/0-Ag14 (SP-2) (Nature, 276: 269 (1978));
P3-X63-Ag8653 (653) (J. Immunol., 123: 1548 (1979)); P3-X63-Ag8 (X63) cell line (Nature, 256: 495 (1975)), and the
like, which are 8-azaguanine-resistant mouse (BALB/c) myeloma cell lines. These cell lines are subcultured in 8-azaguanine
medium (medium in which, to a medium obtained by adding 1.5 mmol/l glutamine, 5×10⁻⁵ mol/l 2-mercaptoethanol,
10 µg/ml gentamicin and 10% fetal calf serum (FCS) (manufactured by CSL) to RPMI-1640 medium (hereinafter
referred to as the "normal medium"), 8-azaguanine is further added at 15 µg/ml) and cultured in the normal
medium 3 or 4 days before cell fusion, and 2×10⁷ or more of the cells are used for the fusion.

(c) Production of hybridoma 

[0299] The antibody-producing cells obtained in (a) and the myeloma cells obtained in (b) are washed with MEM 
medium or PBS (disodium hydrogen phosphate: 1.83 g, sodium dihydrogen phosphate: 0.21 g, sodium chloride: 7.65 
g, distilled water: 1 liter, pH: 7.2) and mixed to give a ratio of antibody-producing cells : myeloma cells = 5 : 1 to 10 :
1, followed by centrifugation at 1,200 rpm for 5 minutes, and the supernatant is discarded. 

[0300] The cells in the resulting precipitated fraction are thoroughly loosened, 0.2 to 1 ml of a mixed solution of 2
g of polyethylene glycol-1000 (PEG-1000), 2 ml of MEM medium and 0.7 ml of dimethylsulfoxide (DMSO) per 10⁸
antibody-producing cells is added to the cells under stirring at 37°C, and then 1 to 2 ml of MEM medium is further
added thereto several times at 1 to 2 minute intervals.
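
The quantities in the fusion step scale with the number of antibody-producing cells. The following is a minimal sketch of that arithmetic, assuming the mixing ratio (antibody-producing cells : myeloma cells = 5 : 1 to 10 : 1) and the PEG mixed-solution volume (0.2 to 1 ml per 10⁸ antibody-producing cells) stated above. The function names are hypothetical.

```python
def myeloma_cell_range(antibody_cells):
    """(min, max) myeloma cells for a 10:1 to 5:1 antibody-producing:myeloma ratio."""
    if antibody_cells <= 0:
        raise ValueError("cell count must be positive")
    return antibody_cells / 10, antibody_cells / 5


def peg_volume_range_ml(antibody_cells):
    """(min, max) volume of PEG/MEM/DMSO mixed solution, 0.2 to 1 ml per 1e8 cells."""
    if antibody_cells <= 0:
        raise ValueError("cell count must be positive")
    scale = antibody_cells / 1e8
    return 0.2 * scale, 1.0 * scale
```

For 2×10⁸ antibody-producing cells this gives 2×10⁷ to 4×10⁷ myeloma cells and 0.4 to 2 ml of the PEG mixed solution.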

[0301] After the addition, MEM medium is added to give a total amount of 50 ml. The resulting prepared solution is 
centrifuged at 900 rpm for 5 minutes, and then the supernatant is discarded. The cells in the resulting precipitated 
fraction are gently loosened and then gently suspended in 100 ml of HAT medium (the normal medium to which 10⁻⁴
mol/l hypoxanthine, 1.5×10⁻⁵ mol/l thymidine and 4×10⁻⁷ mol/l aminopterin have been added) by repeated drawing
up into and discharging from a measuring pipette. 

[0302] The suspension is poured into a 96 well culture plate at 100 µl/well and cultured at 37°C for 7 to 14 days in
a 5% CO2 incubator.

[0303] After culturing, a part of the culture supernatant is recovered, and a hybridoma which specifically reacts with 
a partial fragment polypeptide of the polypeptide of the present invention is selected according to the enzyme immu- 
noassay described in Antibodies, A Laboratory manual, Cold Spring Harbor Laboratory, Chapter 14 (1998) and the like. 
[0304] A specific example of the enzyme immunoassay is described below. 

[0305] The partial fragment polypeptide of the polypeptide of the present invention used as the antigen in the immu- 
nization is spread on a suitable plate, is allowed to react with a hybridoma culturing supernatant or a purified antibody 
obtained in (d) described below as a first antibody, and is further allowed to react with an anti-rat or anti-mouse immu- 
noglobulin antibody labeled with an enzyme, a chemical luminous substance, a radioactive substance or the like as a 
second antibody for reaction suitable for the labeled substance. A hybridoma which specifically reacts with the
polypeptide of the present invention is selected as a hybridoma capable of producing a monoclonal antibody of the present
invention.

[0306] Cloning of the hybridoma is repeated twice by limiting dilution (HT medium (a medium in which
aminopterin has been removed from HAT medium) is used first, and the normal medium is used second), and a
hybridoma which is stable and shows a sufficient antibody titer is selected as a hybridoma capable of
producing a monoclonal antibody of the present invention.

(d) Preparation of monoclonal antibody 

[0307] The monoclonal antibody-producing hybridoma cells obtained in (c) are injected intraperitoneally into 8- to
10-week-old mice or nude mice treated with pristane (intraperitoneal administration of 0.5 ml of
2,6,10,14-tetramethylpentadecane (pristane), followed by 2 weeks of feeding) at 5×10⁶ to 20×10⁶ cells/animal. The hybridoma causes
ascites tumor in 10 to 21 days.

[0308] The ascitic fluid is collected from the mice or nude mice, and centrifuged to remove solid contents at 3000 
rpm for 5 minutes. 

[0309] A monoclonal antibody can be purified and isolated from the resulting supernatant according to the method 
similar to that used in the polyclonal antibody. 

[0310] The subclass of the antibody can be determined using a mouse monoclonal antibody typing kit or a rat
monoclonal antibody typing kit. The polypeptide amount can be determined by the Lowry method or by calculation based
on the absorbance at 280 nm.
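
The calculation based on absorbance at 280 nm can be sketched as follows. The text does not give an extinction coefficient, so this example assumes the commonly quoted approximation that a 1 mg/ml IgG solution has an absorbance of about 1.4 at 280 nm in a 1 cm cuvette; that value, the default path length and the function name are assumptions for illustration and should be replaced by values appropriate to the actual antibody.

```python
def igg_mg_per_ml(a280, dilution_factor=1.0, a280_per_mg_ml=1.4, path_cm=1.0):
    """Estimate protein concentration (mg/ml) from absorbance at 280 nm.

    a280_per_mg_ml is an ASSUMED value (approx. 1.4 for IgG at 1 mg/ml, 1 cm);
    it is not taken from the text.
    """
    if a280 < 0 or dilution_factor <= 0 or a280_per_mg_ml <= 0 or path_cm <= 0:
        raise ValueError("invalid measurement parameters")
    return a280 * dilution_factor / (a280_per_mg_ml * path_cm)
```

Under these assumptions, a sample diluted 2-fold that reads 0.70 corresponds to about 1 mg/ml of antibody in the undiluted solution.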

[0311] The antibody obtained in the above is within the scope of the antibody of the present invention.
[0312] The antibody can be used for the general assay using an antibody, such as a radioactive material labeled
immunoassay (RIA), competitive binding assay, an immunotissue chemical staining method (ABC method, CSA method,
etc.), immunoprecipitation, Western blotting, ELISA assay, and the like (An Introduction to Radioimmunoassay and
Related Techniques, Elsevier Science (1986); Techniques in Immunocytochemistry, Academic Press, Vol. 1 (1982),
Vol. 2 (1983) & Vol. 3 (1985); Practice and Theory of Enzyme Immunoassays, Elsevier Science (1985); Enzyme-linked
Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory
(1988); Monoclonal Antibody Experiment Manual, Kodansha Scientific (1987); Second Series Biochemical Experiment
Course, Vol. 5, Immunobiochemistry Research Method, Tokyo Kagaku Dojin (1986)).
[0313] The antibody of the present invention can be used as it is or after being labeled with a label. 
[0314] Examples of the label include radioisotope, an affinity label (e.g., biotin, avidin, or the like), an enzyme label 
(e.g., horseradish peroxidase, alkaline phosphatase, or the like), a fluorescence label (e.g., FITC, rhodamine, or the 
like), a label using a rhodamine atom (J. Histochem. Cytochem., 18: 315 (1970); Meth. Enzym., 62: 308 (1979);
Immunol., 109: 129 (1972); J. Immunol. Meth., 13: 215 (1979)), and the like.

[0315] Expression of the polypeptide of the present invention, fluctuation of the expression, the presence or absence 
of structural change of the polypeptide, and the presence or absence in an organism other than coryneform bacteria 
of a polypeptide corresponding to the polypeptide can be analyzed using the antibody or the labeled antibody by the 
above assay, or a polypeptide array or proteome analysis described below. 

[0316] Furthermore, the polypeptide recognized by the antibody can be purified by immunoaffinity chromatography 
using the antibody of the present invention. 

12. Production and use of polypeptide array 

(1) Production of polypeptide array 

[0317] A polypeptide array can be produced using the polypeptide of the present invention obtained in the above 
item 10 or the antibody of the present invention obtained in the above item 11. 

[0318] The polypeptide array of the present invention includes protein chips, and comprises a solid support and the 
polypeptide or antibody of the present invention adhered to the surface of the solid support. 

[0319] Examples of the solid support include plastic such as polycarbonate or the like; an acrylic resin, such as 
polyacrylamide or the like; complex carbohydrates, such as agarose, sepharose, or the like; silica; a silica-based ma- 
terial, carbon, a metal, inorganic glass, latex beads, and the like. 

[0320] The polypeptides or antibodies according to the present invention can be adhered to the surface of the solid 
support according to the method described in Biotechniques, 27: 1258-61 (1999); Molecular Medicine Today, 5: 326-7 
(1999); Handbook of Experimental Immunology, 4th edition, Blackwell Scientific Publications, Chapter 10 (1986); Meth. 
Enzym., 34 (1974); Advances in Experimental Medicine and Biology, 42 (1974); U.S. Patent 4,681,870; U.S. Patent 
4,282,287; U.S. Patent 4,762,881 , or the like. 

[0321] The analysis described herein can be efficiently performed by adhering the polypeptide or antibody of the 
present invention to the solid support at a high density, though a high fixation density is not always necessary.

(2) Use of polypeptide array

[0322] A polypeptide or a compound capable of binding to and interacting with the polypeptides of the present
invention adhered to the array can be identified using the polypeptide array to which the polypeptides of the present
invention have been adhered as described in the above (1).

[0323] Specifically, a polypeptide or a compound capable of binding to and interacting with the polypeptides of the 
present invention can be identified by subjecting the polypeptides of the present invention to the following steps (i) to (iv): 

(i) preparing a polypeptide array having the polypeptide of the present invention adhered thereto by the method 
of the above (1); 

(ii) incubating the polypeptide immobilized on the polypeptide array together with at least one of a second polypep- 
tide or compound; 

(iii) detecting any complex formed between the at least one of a second polypeptide or compound and the polypep- 
tide immobilized on the array using, for example, a label bound to the at least one of a second polypeptide or 
compound, or a secondary label which specifically binds to the complex or to a component of the complex after 
unbound material has been removed; and 

(iv) analyzing the detection data. 
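
Step (iv) is not specified further in the text. One simple way to analyze detection data, shown here only as a sketch under stated assumptions, is to call a spot positive when its signal exceeds the mean of negative-control spots by a chosen number of standard deviations. The function name `call_binders`, the spot labels, and the default cutoff of three standard deviations are all hypothetical choices, not part of the described method.

```python
from statistics import mean, stdev


def call_binders(signals, negative_controls, k=3.0):
    """Return (hits, cutoff).

    signals: dict mapping a spot label to its measured signal.
    negative_controls: signals from spots expected to show no binding.
    A spot is a hit if its signal is above mean + k * SD of the controls.
    """
    if len(negative_controls) < 2:
        raise ValueError("need at least two negative-control values")
    cutoff = mean(negative_controls) + k * stdev(negative_controls)
    hits = sorted(label for label, value in signals.items() if value > cutoff)
    return hits, cutoff
```

With controls of 100, 110, 90, 105 and 95 (arbitrary units), the cutoff is a little under 124, so a spot reading 130 would be called while one reading 120 would not. The choice of k trades sensitivity against false positives and would need to be tuned on real data.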

[0324] Specific examples of the polypeptide array to which the polypeptide of the present invention has been adhered
include a polypeptide array containing a solid support to which is adhered at least one of a polypeptide containing an amino acid
sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide containing an amino acid sequence in which at
least one amino acid is deleted, replaced, inserted or added in the amino acid sequence of the polypeptide and having
substantially the same activity as that of the polypeptide, a polypeptide containing an amino acid sequence having a
homology of 60% or more with the amino acid sequence of the polypeptide and having substantially the same activity
as that of the polypeptide, a partial fragment polypeptide, and a peptide comprising an amino acid sequence of a part
of a polypeptide.

[0325] The amount of production of a polypeptide derived from coryneform bacteria can be analyzed using a polypep- 
tide array to which the antibody of the present invention has been adhered in the above (1). 

[0326] Specifically, the expression amount of a gene derived from a mutant of coryneform bacteria can be analyzed 
by subjecting the gene to the following steps (i) to (iv): 

(i) preparing a polypeptide array by the method of the above (1); 

(ii) incubating the polypeptide array (the first antibody) together with a polypeptide derived from a mutant of co- 
ryneform bacteria; 

(iii) detecting the polypeptide bound to the polypeptide immobilized on the array using a labeled second antibody 
of the present invention; and 

(iv) analyzing the detection data. 

[0327] Specific examples of the polypeptide array to which the antibody of the present invention is adhered include
a polypeptide array comprising a solid support to which is adhered at least one antibody which recognizes a polypeptide
comprising an amino acid sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide comprising an amino
acid sequence in which at least one amino acid is deleted, replaced, inserted or added in the amino acid sequence
of the polypeptide and having substantially the same activity as that of the polypeptide, a polypeptide comprising an
amino acid sequence having a homology of 60% or more with the amino acid sequence of the polypeptide and having
substantially the same activity as that of the polypeptide, a partial fragment polypeptide, or a peptide comprising an
amino acid sequence of a part of a polypeptide.

[0328] A fluctuation in an expression amount of a specific polypeptide can be monitored using a polypeptide obtained 
in the time course of culture as the polypeptide derived from coryneform bacteria. The culturing conditions can be 
optimized by analyzing the fluctuation. 
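
Monitoring a fluctuation over the time course of a culture can be summarized, for instance, as the log2 fold change of each time point relative to the first. This is a minimal sketch; the function name and the choice of the first time point as the baseline are assumptions made for illustration.

```python
import math


def log2_fold_changes(signals):
    """log2(signal / first signal) for each time point, in order.

    signals: positive array signals for one polypeptide, ordered by sampling time.
    """
    if not signals or any(s <= 0 for s in signals):
        raise ValueError("signals must be a non-empty list of positive values")
    baseline = signals[0]
    return [math.log2(s / baseline) for s in signals]
```

A value of +1 means the signal has doubled relative to the start of the culture and -1 means it has halved, which makes increases and decreases symmetric when comparing culturing conditions.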

[0329] When a polypeptide derived from a mutant of coryneform bacteria is used, a mutated polypeptide can be 
detected. 

13. Identification of useful mutation in mutant by proteome analysis 

[0330] Usually, proteome analysis is used herein to refer to a method wherein polypeptides are separated by
two-dimensional electrophoresis and a separated polypeptide is digested with an enzyme, followed by identification of the
polypeptide using a mass spectrometer (MS) and searching a data base.

[0331] The two-dimensional electrophoresis means an electrophoretic method which is performed by combining two
electrophoretic procedures having different principles. For example, polypeptides are separated depending on molecular
weight in the primary electrophoresis. Next, the gel is rotated by 90° or 180° and the secondary electrophoresis
is carried out depending on isoelectric point. Thus, various separation patterns can be achieved (JIS K 3600 2474).
[0332] In searching the data base, the amino acid sequence information of the polypeptides of the present invention
and the recording medium of the present invention provided for in the above items 2 and 8 can be used.
[0333] The proteome analysis of a coryneform bacterium and its mutant makes it possible to identify a polypeptide 
showing a fluctuation therebetween. 

[0334] The proteome analysis of a wild type strain of coryneform bacteria and a production strain showing an
improved productivity of a target product makes it possible to efficiently identify a mutated protein which is useful in
breeding for improving the productivity of a target product, or a protein whose expression amount fluctuates.
[0335] Specifically, a wild type strain of coryneform bacteria and a lysine-producing strain thereof are each subjected 
to the proteome analysis. Then, a spot increased in the lysine-producing strain, compared with the wild type strain, is 
found and a data base is searched so that a polypeptide showing an increase in yield in accordance with an increase 
in the lysine productivity can be identified. For example, as a result of the proteome analysis on a wild type strain and 
a lysine-producing strain, the productivity of the catalase having the amino acid sequence represented by SEQ ID NO: 
3785 is increased in the lysine-producing mutant. 
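
The comparison of spots between a wild type strain and a lysine-producing strain can be sketched as a fold-change filter over matched spot intensities. The spot identifiers, the intensity values and the 2-fold threshold below are hypothetical; the text does not specify how an "increased" spot is defined.

```python
def increased_spots(wild_type, producer, min_fold=2.0):
    """List (spot_id, fold) for spots whose intensity rose at least min_fold.

    wild_type, producer: dicts mapping a spot id to its gel spot intensity.
    Spots missing from the producer gel or with non-positive wild-type
    intensity are skipped, since no ratio can be formed for them.
    """
    hits = []
    for spot, wt_intensity in wild_type.items():
        if spot not in producer or wt_intensity <= 0:
            continue
        fold = producer[spot] / wt_intensity
        if fold >= min_fold:
            hits.append((spot, fold))
    # Largest increase first, so the strongest candidates are searched first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

The spots returned by such a filter would then be excised, digested and identified by mass spectrometry and data base search as described above.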

[0336] When a protein having a high expression level is identified by proteome analysis using the nucleotide
sequence information and the amino acid sequence information of the genome of the coryneform bacteria of the
present invention, and a recording medium storing the sequences, the nucleotide sequence of the gene encoding this
protein and the nucleotide sequence upstream thereof can be searched at the same time, and thus a nucleotide
sequence having a high expression promoter can be efficiently selected.
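
Retrieving the nucleotide sequence upstream of a gene from stored genome sequence information can be sketched as follows. This is an illustration under stated assumptions: coordinates are 0-based and half-open, the genome is treated as linear (wrap-around on a circular chromosome is ignored), and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome, orf_start, orf_end, strand, length):
    """Return up to `length` bases upstream of an ORF.

    genome: forward-strand sequence; ORF occupies genome[orf_start:orf_end].
    strand: "+" if the ORF is read left to right, "-" if it lies on the
    reverse strand (its upstream region is then to the right of orf_end and
    is returned as read on the reverse strand).
    """
    if strand == "+":
        return genome[max(0, orf_start - length):orf_start]
    if strand == "-":
        return reverse_complement(genome[orf_end:orf_end + length])
    raise ValueError("strand must be '+' or '-'")
```

The returned region is only a candidate: whether it contains a strong promoter still has to be confirmed experimentally.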
[0337] In the proteome analysis, it is sometimes difficult to identify a spot
derived from a modified protein. However, the modified protein can be efficiently identified using the
nucleotide sequence information and the amino acid sequence information of the genome of coryneform
bacteria, and the recording medium storing the sequences, according to the present invention.
[0338] Moreover, a useful mutation point in a useful mutant can be easily specified by searching a nucleotide
sequence (nucleotide sequence of promoters, ORF, or the like) relating to the thus identified protein using a recording
medium storing the nucleotide sequence information and the amino acid sequence information of the genome of
coryneform bacteria of the present invention, and a recording medium storing the sequences, and using a primer
designed on the basis of the detected nucleotide sequence. Once the useful mutation point is specified, an
industrially useful mutant having the useful mutation or other useful mutation derived therefrom can be easily bred.
[0339] The present invention will be explained in detail below based on Examples. However, the present invention 
is not limited thereto. 

Example 1 

Determination of the full nucleotide sequence of genome of Corynebacterium glutamicum 

[0340] The full nucleotide sequence of the genome of Corynebacterium glutamicum was determined based on the 
whole genome shotgun method (Science, 269: 496-512 (1995)). In this method, a genome library was prepared and 
the terminal sequences were determined at random. Subsequently, these sequences were ligated on a computer to 
cover the full genome. Specifically, the following procedure was carried out. 
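
The idea of joining randomly determined terminal sequences on a computer can be illustrated with a toy greedy overlap merge. This is a deliberately simplified sketch and not the assembly software used in the Example: it assumes error-free reads from a single strand, ignores reverse complements and repeats, and the function names and the minimum overlap of 3 bases are arbitrary choices for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (>= min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads wholly contained in another read; they add no new sequence.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no remaining overlaps: contigs stay separate (gaps)
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs
```

Three overlapping reads `ATGGCGTA`, `GCGTACGT` and `ACGTTAGC` are merged into the single contig `ATGGCGTACGTTAGC`; reads with no sufficient overlap are left as separate contigs, which corresponds to gaps that have to be closed by other means.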

(1) Preparation of genome DNA of Corynebacterium glutamicum ATCC 13032 

[0341] Corynebacterium glutamicum ATCC 1 3032 was cultured in BY medium (7 g/l meat extract, 1 0 g/l peptone, 3 
g/l sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1 % of glycine at 30°C overnight and the cells were collected 
by centrifugation. After washing with STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/i EDTA, pH 
8.0), the cells were suspended in 10 ml of STE buffer containing 10 mg/ml lysozyme, followed by gently shaking at 
37°C for 1 hour. Then, 2 ml of 10% SDS was added thereto to lyse the cells, and the resultant mixture was maintained 
at 65°C for 10 minutes and then cooled to room temperature. Then, 1 0 ml of Tris-neutralized phenol was added thereto, 
followed by gently shaking at room temperature for 30 minutes and centrifugation (15,000 x g, 20 minutes, 20°C). The 
aqueous layer was separated and subjected to extraction with phenol/chloroform and extraction with chloroform (twice) 
in the same manner. To the aqueous layer, 3 mol/l sodium acetate solution (pH 5.2) and isopropanol were added at 
1/10 times volume and twice volume, respectively, followed by gently stirring to precipitate the genome DNA. The 
genome DNA was dissolved again in 3 ml of TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) containing 
0.02 mg/ml of RNase and maintained at 37°C for 45 minutes. The extractions with phenol, phenol/chloroform and 
chloroform were carried out successively in the same manner as the above. The genome DNA was subjected to
isopropanol precipitation. The thus formed genome DNA precipitate was washed with 70% ethanol three times, followed
by air-drying, and dissolved in 1.25 ml of TE buffer to give a genome DNA solution (concentration: 0.1 mg/ml).
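The volume ratios and the final yield stated in this step can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 10 ml aqueous-layer volume is an assumed example value, since the text does not state the recovered volume.

```python
def precipitation_volumes(aqueous_ml):
    """Volumes to add for precipitation: 1/10 volume of 3 mol/l sodium
    acetate (pH 5.2) and 2 volumes of isopropanol, as stated in the text."""
    return {
        "sodium_acetate_ml": aqueous_ml / 10,
        "isopropanol_ml": aqueous_ml * 2,
    }


def dna_mass_mg(volume_ml, conc_mg_per_ml):
    """Mass of DNA contained in a solution of the given volume and concentration."""
    return volume_ml * conc_mg_per_ml


# Assumed example: 10 ml of aqueous layer (hypothetical value).
vols = precipitation_volumes(10.0)

# Final solution from the text: 1.25 ml at 0.1 mg/ml.
yield_mg = dna_mass_mg(1.25, 0.1)
print(vols, yield_mg)
```

Under these figures the final solution holds about 0.125 mg of genome DNA, which is enough for the 0.01 mg used for the shotgun library and the roughly 0.1 mg used for the cosmid library.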

(2) Construction of a shotgun library 

[0342] TE buffer was added to 0.01 mg of the thus prepared genome DNA of Corynebacterium glutamicum ATCC
13032 to give a total volume of 0.4 ml, and the mixture was treated with a sonicator (Yamato Powersonic Model 150) 
at an output of 20 continuously for 5 seconds to obtain fragments of 1 to 10 kb. The genome fragments were blunt- 
ended using a DNA blunting kit (manufactured by Takara Shuzo) and then fractionated by 6% polyacrylamide gel 
electrophoresis. Genome fragments of 1 to 2 kb were cut out from the gel, and 0.3 ml MG elution buffer (0.5 mol/l 
ammonium acetate, 10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) was added thereto, followed by shaking 
at 37°C overnight to elute DNA. The DNA eluate was treated with phenol/chloroform, and then precipitated with ethanol 
to obtain a genome library insert. The total insert and 500 ng of pUC18 SmaI/BAP (manufactured by Amersham
Pharmacia Biotech) were ligated at 16°C for 40 hours.

[0343] The ligation product was precipitated with ethanol and dissolved in 0.01 ml of TE buffer. The ligation solution 
(0.001 ml) was introduced into 0.04 ml of E. coli ELECTRO MAX DH10B (manufactured by Life Technologies) by
electroporation under conditions according to the manufacturer's instructions. The mixture was spread on LB plate
medium (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of agar) 
containing 0.1 mg/ml ampicillin, 0.1 mg/ml X-gal and 1 mmol/l isopropyl-β-D-thiogalactopyranoside (IPTG) and cultured
at 37°C overnight. 

[0344] The transformant obtained from colonies formed on the plate medium was stationarily cultured in a 96-well
titer plate having 0.05 ml of LB medium containing 0.1 mg/ml ampicillin at 37°C overnight. Then, 0.05 ml of LB medium
containing 20% glycerol was added thereto, followed by stirring to obtain a glycerol stock.

(3) Construction of cosmid library

[0345] About 0.1 mg of the genome DNA of Corynebacterium glutamicum ATCC 13032 was partially digested with
Sau3AI (manufactured by Takara Shuzo) and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under 10 to 40%
sucrose density gradient obtained using 10% and 40% sucrose buffers (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5
mmol/l EDTA, 10% or 40% sucrose, pH 8.0). After the centrifugation, the solution thus separated was fractionated into
tubes at 1 ml in each tube. After confirming the DNA fragment length of each fraction by agarose gel electrophoresis, 
a fraction containing a large amount of DNA fragment of about 40 kb was precipitated with ethanol. 
[0346] The DNA fragment was ligated to the BamHI site of superCos1 (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The ligation product was incorporated into Escherichia coli XL-1-BlueMR strain
(manufactured by Stratagene) using Gigapack III Gold Packaging Extract (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The Escherichia coli was spread on LB plate medium containing 0.1 mg/ml
ampicillin and cultured thereon at 37°C overnight to isolate colonies. The resulting colonies were stationarily cultured at
37°C overnight in a 96-well titer plate containing 0.05 ml of the LB medium containing 0.1 mg/ml ampicillin in each 
well. LB medium containing 20% glycerol (0.05 ml) was added thereto, followed by stirring to obtain a glycerol stock. 

(4) Determination of nucleotide sequence 
(4-1) Preparation of template

[0347] The full nucleotide sequence of Corynebacterium glutamicum ATCC 13032 was determined mainly based on 
the whole genome shotgun method. The template used in the whole genome shotgun method was prepared by the 
PCR method using the library prepared in the above (2). 

[0348] Specifically, the clone derived from the whole genome shotgun library was inoculated using a replicator (man- 
ufactured by GENETIX) into each well of a 96-well plate containing the LB medium containing 0.1 mg/ml of ampicillin 
at 0.08 ml per each well and then stationarily cultured at 37°C overnight. 

[0349] Next, the culture solution was transferred using a copy plate (manufactured by Tokken) into a 96-well
reaction plate (manufactured by PE Biosystems) containing a PCR reaction solution (TaKaRa Ex Taq (manufactured by
Takara Shuzo)) at 0.08 ml per each well. Then, PCR was carried out in accordance with the protocol by Makino et al.
(DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE Biosystems) to amplify the 
inserted fragment. 

[0350] The excess primers and nucleotides were eliminated using a kit for purifying a PCR product (manufactured
by Amersham Pharmacia Biotech) and the residue was used as the template in the sequencing reaction.
[0351] Some nucleotide sequences were determined using a double-stranded DNA plasmid as a template.
[0352] The double-stranded DNA plasmid as the template was obtained by the following method.
[0353] The clone derived from the whole genome shotgun library was inoculated into a 24- or 96-well plate containing 
a 2x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) containing 0.05 mg/ml ampicillin
at 1.5 ml per each well and then cultured under shaking at 37°C overnight. 

[0354] The double-stranded DNA plasmid was prepared from the culture solution using an automatic plasmid
preparing machine, KURABO PI-50 (manufactured by Kurabo Industries) or a multiscreen (manufactured by Millipore) in
accordance with the protocol provided by the manufacturer. 

[0355] To purify the double-stranded DNA plasmid using the multiscreen, Biomek 2000 (manufactured by Beckman 
Coulter) or the like was employed. 

[0356] The thus obtained double-stranded DNA plasmid was dissolved in water to give a concentration of about 0.1 
mg/ml and used as the template in sequencing. 

(4-2) Sequencing reaction 

[0357] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured
by PE Biosystems), an M13 regular direction primer (M13-21) or an M13 reverse direction primer (M13REV) (DNA
Research, 5: 1-9 (1998)) and the template prepared in the above (4-1) (the PCR product or the plasmid) were added
to give 10 µl of a sequencing reaction solution. The primers and the templates were used in an amount of 1.6 pmol
and an amount of 50 to 200 ng, respectively.

[0358] Dye terminator sequencing reaction of 45 cycles was carried out with GeneAmp PCR System 9700
(manufactured by PE Biosystems) using the reaction solution. The cycle parameter was determined in accordance with the
manufacturer's instructions. The
sample was purified using Multiscreen HV plate (manufactured by Millipore) according to the manufacturer's
instructions. The thus purified reaction product was precipitated with ethanol, followed by drying, and then stored in the dark
at -30°C.

[0359] The dry reaction product was analyzed by ABI PRISM 377 DNA Sequencer and ABI PRISM 3700 DNA
Analyzer (both manufactured by PE Biosystems) each in accordance with the manufacturer's instructions.
[0360] The data of about 50,000 sequences in total (i.e., about 42,000 sequences obtained using 377 DNA
Sequencer and about 8,000 reactions obtained by 3700 DNA Analyzer) were transferred to a server (Alpha Server 4100:
manufactured by COMPAQ) and stored. The data of these about 50,000 sequences corresponded to 6 times as much as
the genome size. 
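The relation between read count and coverage stated above can be illustrated with the usual redundancy formula, coverage = (number of reads x mean read length) / genome size. The sketch below is not from the source: the genome size of about 3.3 Mb is an assumed round figure for Corynebacterium glutamicum, and the function names are hypothetical.

```python
def coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Sequencing redundancy: total sequenced bases divided by genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_read_length(n_reads, fold, genome_size_bp):
    """Mean read length implied by a given read count and fold coverage."""
    return fold * genome_size_bp / n_reads


GENOME_SIZE_BP = 3_300_000  # assumption: approximate genome size, not stated in this section

# About 50,000 reads giving about 6-fold coverage, as stated in the text.
read_len = implied_read_length(50_000, 6, GENOME_SIZE_BP)
print(round(read_len), coverage(50_000, read_len, GENOME_SIZE_BP))
```

Under the assumed genome size, the stated figures correspond to a mean usable read length of roughly 400 bases.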

(5) Assembly 

[0361] All operations were carried out on the basis of UNIX platform. The analytical data were output in Macintosh 
platform using X Window System. The base call was carried out using phred (The University of Washington). The 
vector sequence data was deleted using SPS Cross_Match (manufactured by Southwest Parallel Software). The
assembly was carried out using SPS phrap (manufactured by Southwest Parallel Software; a high-speed version of phrap
(The University of Washington)). The contig obtained by the assembly was analyzed using a graphical editor, consed 
(The University of Washington). A series of the operations from the base call to the assembly were carried out simul- 
taneously using a script phredPhrap attached to consed. 
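To make the idea of assembling reads "on a computer" concrete, the following toy sketch merges reads by their longest exact suffix-prefix overlap. It is an illustration only and is not how phrap works: phrap uses quality-aware alignment, while this greedy exact-overlap method, its function names and its `min_len` parameter are assumptions chosen for brevity.

```python
def overlap_length(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b
    (at least min_len bases), or 0 if there is none."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_length(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)


contigs = greedy_assemble(["ATGGCGT", "GCGTACG", "TACGGA"])
print(contigs)
```

Reads that share no overlap stay as separate contigs, which is the situation the gap-closing step in (6) addresses.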

(6) Determination of nucleotide sequence in gap part 

[0362] Each cosmid in the cosmid library constructed in the above (3) was prepared by a method similar to the 
preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end of 
the inserted fragment of the cosmid was determined by using ABI PRISM BigDye Terminator Cycle Sequencing Ready 
Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's instructions.

[0363] About 800 cosmid clones were sequenced at both ends, and the contigs derived from the shotgun sequencing
obtained in the above (5) were searched for nucleotide sequences coincident with these end sequences. Thus, the linkage
between respective cosmid clones and respective contigs was determined and mutual alignment was carried out. Furthermore,
the results were compared with the physical map of Corynebacterium glutamicum ATCC 13032 (Mol. Gen. Genet.,
252: 255-265 (1996)) to carry out mapping between the cosmids and the contigs.

[0364] The sequence in the region which was not covered with the contigs was determined by the following method. 
[0365] Clones containing sequences positioned at the ends of contigs were selected. Among these clones, about 
1,000 clones wherein only one end of the inserted fragment had been determined were selected and the sequence at
the opposite end of the inserted fragment was determined. A shotgun library clone or a cosmid clone containing the 
sequences at the respective ends of the inserted fragment in two contigs was identified, the full nucleotide sequence
of the inserted fragment of this clone was determined, and thus the nucleotide sequence of the gap part was determined.
When no shotgun library clone or cosmid clone covering the gap part was available, primers complementary to the 
end sequences at the two contigs were prepared and the DNA fragment in the gap part was amplified by PCR. Then, 
sequencing was performed by the primer walking method using the amplified DNA fragment as a template or by the 
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment was determined. 
Thus, the nucleotide sequence of the domain was determined. 
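The selection of clones that bridge a gap can be sketched as a simple lookup: a clone is a candidate when its two end sequences fall in different contigs. The data layout, clone names and function name below are hypothetical and serve only to illustrate the logic described above.

```python
from collections import defaultdict


def bridging_clones(end_hits):
    """Group clones whose two end sequences match different contigs.

    end_hits maps a clone name to a pair (contig of one end, contig of the
    other end); None means that end has not been sequenced or placed.
    """
    links = defaultdict(list)
    for clone, (left, right) in end_hits.items():
        if left is not None and right is not None and left != right:
            links[tuple(sorted((left, right)))].append(clone)
    return dict(links)


# Hypothetical example data.
end_hits = {
    "cos001": ("contig1", "contig2"),  # spans the gap between contig1 and contig2
    "cos002": ("contig2", "contig2"),  # both ends inside one contig
    "cos003": ("contig3", None),       # only one end determined so far
    "cos004": ("contig2", "contig1"),  # spans the same gap in the other orientation
}
links = bridging_clones(end_hits)
print(links)
```

Clones such as `cos003`, with only one end determined, correspond to the roughly 1,000 clones whose opposite end was sequenced in this step. Contig pairs with no bridging clone correspond to the case handled by PCR amplification across the gap.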

[0366] In a region showing a low sequence precision, primers were synthesized using AUTOFINISH function and 
NAVIGATING function of consed (The University of Washington) and the sequence was determined by the primer 
walking method to improve the sequence precision. The thus determined full nucleotide sequence of the genome of 
Corynebacterium glutamicum ATCC 13032 strain is shown in SEQ ID NO:1. 

(7) Identification of ORF and presumption of its function 

[0367] ORFs in the nucleotide sequence represented by SEQ ID NO:1 were identified according to the following 
method. First, the ORF regions were determined using software for identifying ORF, i.e., Glimmer, GeneMark and 
GeneMark.hmm on UNIX platform according to the respective manual attached to the software. 
[0368] Based on the data thus obtained, ORFs in the nucleotide sequence represented by SEQ ID NO:1 were
identified.

[0369] The putative function of an ORF was determined by searching the homology of the identified amino acid
sequence of the ORF, using Frame Search (manufactured by Compugen) or BLAST, against an amino acid database
consisting of protein-encoding domains derived from Swiss-Prot, PIR or Genpept database constituted by protein
encoding domains derived from GenBank database. The nucleotide sequences
of the thus determined ORFs are shown in SEQ ID NOS:2 to 3501, and the amino acid sequences encoded by these
ORFs are shown in SEQ ID NOS:3502 to 7001.

[0370] In some cases of the sequence listings in the present invention, nucleotide sequences, such as TTG, TGT, 
GGT, and the like, other than ATG, are read as an initiating codon encoding Met. 
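The convention that a non-ATG initiating codon is read as Met can be sketched as a small translation routine. This is a minimal illustration, not the method used by Glimmer or GeneMark: the start-codon set simply copies the codons listed in the text, the codon table is the standard genetic code, and the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Initiating codons as listed in the text (ATG plus the named alternatives).
START_CODONS = {"ATG", "TTG", "TGT", "GGT"}


def translate_orf(dna, start_codons=START_CODONS):
    """Translate an ORF, reading its first codon as Met whichever
    permitted initiating codon it is, and stopping at the first stop codon."""
    dna = dna.upper()
    if dna[:3] not in start_codons:
        raise ValueError("ORF does not begin with a permitted initiating codon")
    protein = ["M"]
    for i in range(3, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(translate_orf("TTGGCTAAATAA"))
```

In this example TTG would encode Leu at an internal position, but as the initiating codon it is read as Met.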

[0371] Also, the preferred nucleotide sequences are SEQ ID NOS:2 to 355 and 357 to 3501, and the preferred amino
acid sequences are shown in SEQ ID NOS:3502 to 3855 and 3857 to 7001.
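The two numbering ranges have the same length, which suggests that each ORF's amino acid SEQ ID NO is its nucleotide SEQ ID NO plus 3500. The text gives only the ranges, so the fixed offset in the sketch below is an assumption, and the function names are hypothetical.

```python
FIRST_DNA_ID, LAST_DNA_ID = 2, 3501  # nucleotide sequences of the ORFs
OFFSET = 3500                        # assumption: a.a. SEQ ID NO = DNA SEQ ID NO + 3500


def aa_id_for(dna_id):
    """Amino acid SEQ ID NO paired with a nucleotide SEQ ID NO."""
    if not FIRST_DNA_ID <= dna_id <= LAST_DNA_ID:
        raise ValueError("not an ORF nucleotide SEQ ID NO")
    return dna_id + OFFSET


def dna_id_for(aa_id):
    """Nucleotide SEQ ID NO paired with an amino acid SEQ ID NO."""
    dna_id = aa_id - OFFSET
    if not FIRST_DNA_ID <= dna_id <= LAST_DNA_ID:
        raise ValueError("not an ORF amino acid SEQ ID NO")
    return dna_id


print(aa_id_for(2), aa_id_for(3501), dna_id_for(3856))
```

The offset is consistent with the preferred ranges above, where the excluded nucleotide sequence 356 pairs with the excluded amino acid sequence 3856.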

[0372] Table 1 shows the registration numbers in the above-described databases of sequences which were judged 
as having the highest homology with the nucleotide sequences of the ORFs as the results of the homology search in 
the amino acid sequences using the homology-searching software Frame Search (manufactured by Compugen), 
names of the genes of these sequences, the functions of the genes, and the matched length, identities and analogies 
compared with publicly known amino acid translation sequences. Moreover, the corresponding positions were
confirmed via the alignment of the nucleotide sequence of an arbitrary ORF with the nucleotide sequence of SEQ ID NO:1.
Also, the positions of nucleotide sequences other than the ORFs (for example, ribosomal RNA genes, transfer RNA
genes, IS sequences, and the like) on the genome were determined. 
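As an illustration of the "matched length" and "identity" columns, the sketch below computes both from a pair of aligned sequences. The exact definitions used by Frame Search are not given in the text, so this is an assumption: gap positions are excluded from the matched length, and identity is the share of exactly matching residue pairs. The function name is hypothetical.

```python
def percent_identity(aligned_query, aligned_subject):
    """Return (matched length, identity in %) for two aligned sequences
    of equal length, where '-' denotes a gap."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    compared = [(q, s) for q, s in zip(aligned_query, aligned_subject)
                if q != "-" and s != "-"]
    if not compared:
        return 0, 0.0
    matches = sum(1 for q, s in compared if q == s)
    return len(compared), 100.0 * matches / len(compared)


matched_length, identity = percent_identity("MKT-LV", "MRTALV")
print(matched_length, identity)
```

The similarity column additionally counts conservative substitutions as matches, which requires a substitution matrix and is left out of this sketch.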

[0373] Fig. 1 shows the positions of typical genes of Corynebacterium glutamicum ATCC 13032 on the genome.

Table 1 (continued), fragment

Position (nt), column heading missing: 45347, 46489, [?], 48485, 49368, 49601, 50616, 50972, 51436, 53055, 53095, 54080, 56417
SEQ NO. (a.a.): 3541 to 3560
SEQ NO. (DNA): [?]

[?] marks an entry that is illegible in the source.
Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Function
3561 | 56676 | 56386 | [?] | [?] | Bacillus subtilis yrkF | [?] | 74.3 | hypothetical protein
3562 | 57270 | 56680 | [?] | [?] | Synechocystis sp. PCC6803 slr1261 | [?] | 70.4 | hypothetical protein
3563 | 57478 | 57651 | [?] | pir:G70988 | Mycobacterium tuberculosis H37Rv Rv1766 | [?] | 83.9 | hypothetical protein
3564 | 58087 | 58941 | [?] | - | - | - | - | -
3565 | 59091 | 59930 | [?] | [?] | Leishmania major L4768.11 | 26.8 | [?] | hypothetical protein
3566 | 59952 | 60662 | [?] | - | - | - | - | -
3567 | 69909 | 62321 | 1653 | - | - | - | - | -
3568 | 63508 | 62390 | 1119 | pir:F70952 | Mycobacterium tuberculosis H37Rv Rv1239c corA | [?] | 59.5 | magnesium and cobalt transport protein
3569 | 64040 | 63594 | [?] | - | - | - | - | -
3570 | 64190 | 65458 | 1269 | gp:AF179611_12 | Zymomonas mobilis ZM4 clcb | [?] | 64.8 | chloride channel protein
3571 | 66197 | 65508 | [?] | sp:PNUC_SALTY | Salmonella typhimurium pnuC | [?] | 53.1 | required for NMN transport
3572 | 66851 | 67972 | 1122 | sp:PHOL_MYCTU | Mycobacterium tuberculosis H37Rv Rv2368c | 29.1 | 60.0 | phosphate starvation-induced protein-like protein
3573 | 68170 | 68301 | [?] | - | - | - | - | -
3574 | 68634 | 68251 | [?] | - | - | - | - | -
3575 | 69060 | 69824 | [?] | - | - | - | - | -
3576 | 70186 | 68720 | 1467 | sp:CITM_BACSU | Bacillus subtilis citM | 42.3 | 68.8 | Mg(2+)/citrate complex secondary transporter
3577 | 70506 | 72158 | 1653 | [?] | Escherichia coli K12 dpiB | [?] | 60.6 | two-component system sensor histidine kinase
3578 | 72043 | 71474 | [?] | - | - | - | - | -
3579 | 72161 | 72814 | [?] | sp:DPIA_ECOLI | Escherichia coli K12 criR | [?] | 63.3 | transcriptional regulator
3580 | 73728 | 72817 | [?] | [?] | Corynebacterium glutamicum unkdh | [?] | [?] | D-isomer specific 2-hydroxyacid dehydrogenase

[?] marks a cell that is illegible in the source; - marks a cell with no entry. The SEQ NO. (DNA) and Matched length (aa) columns are illegible throughout and are omitted.
Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Function
3581 | 73844 | 74272 | [?] | gp:SCM2_3 | Streptomyces coelicolor A3(2) SCM2.03 | 38.6 | 76.4 | hypothetical protein
3582 | 74490 | 75491 | 1002 | sp:BIOB_CORGL | Corynebacterium glutamicum bioB | [?] | 99.7 | biotin synthase
3583 | 75506 | 75742 | [?] | [?] | Mycobacterium tuberculosis H37Rv Rv1590 | 72.1 | 79.1 | hypothetical protein
3584 | 75697 | 76035 | [?] | sp:YKI4_YEAST | Saccharomyces cerevisiae YKL084w | [?] | 63.5 | hypothetical protein
3585 | 76353 | 76469 | [?] | - | - | - | - | -
3586 | 80753 | 80613 | [?] | [?] | Chlamydia muridarum Nigg TC0129 | 71.0 | 75.0 | hypothetical protein
3587 | 81274 | 81002 | [?] | [?] | Chlamydia pneumoniae | [?] | 66.0 | hypothetical protein
3588 | 83568 | 82120 | 1449 | prf:2512333A | Streptomyces virginiae varS | [?] | 59.0 | integral membrane efflux protein
3589 | 84935 | 83691 | 1245 | [?] | Bacillus sp. | [?] | 99.8 | creatinine deaminase
3590 | 85403 | 85098 | [?] | - | - | - | - | -
3591 | 86277 | 85663 | [?] | - | - | - | - | -
3592 | 86318 | 87241 | [?] | sp:HST2_YEAST | Saccharomyces cerevisiae hsi | 26.2 | 50.2 | SIR2 gene family (silent information regulator)
3593 | 88532 | 87561 | [?] | prf:2316378A | Propionibacterium acnes | [?] | 59.0 | triacylglycerol lipase
3594 | 89444 | 88545 | [?] | prf:2316378A | Propionibacterium acnes | [?] | 56.1 | triacylglycerol lipase
3595 | 89558 | 90445 | [?] | - | - | - | - | -
3596 | 90973 | 90461 | [?] | [?] | Corynebacterium glutamicum ureR | 90.6 | 94.7 | transcriptional regulator
3597 | 91174 | 91473 | [?] | [?] | Corynebacterium glutamicum ureA | 100.0 | 100.0 | urease gamma subunit or urease structural protein
3598 | 91503 | 91988 | [?] | gp:CGL251883_2 | Corynebacterium glutamicum ATCC 13032 ureB | 100.0 | 100.0 | urease beta subunit
3599 | 91992 | 93701 | 1710 | gp:CGL251883_3 | Corynebacterium glutamicum ATCC 13032 ureC | 100.0 | 100.0 | urease alpha subunit

[?] marks a cell that is illegible in the source; - marks a cell with no entry. The SEQ NO. (DNA) and Matched length (aa) columns are illegible throughout and are omitted.
Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Function
3600 | 93729 | 94199 | [?] | gp:CGL251883_4 | Corynebacterium glutamicum ATCC 13032 ureE | 100.0 | 100.0 | urease accessory protein
3601 | 94202 | 94879 | [?] | [?] | Corynebacterium glutamicum ATCC 13032 ureF | 100.0 | 100.0 | urease accessory protein
3602 | 94899 | 95513 | [?] | gp:CGL251883_6 | Corynebacterium glutamicum ATCC 13032 ureG | 100.0 | 100.0 | urease accessory protein
3603 | 95517 | 96365 | [?] | gp:CGL251883_7 | Corynebacterium glutamicum ATCC 13032 ureD | 100.0 | 100.0 | urease accessory protein
3604 | 97144 | 96368 | [?] | prf:2318326B | Agrobacterium radiobacter echA | 21.2 | 48.4 | epoxide hydrolase
3605 | 97521 | 98189 | [?] | - | - | - | - | -
3606 | 98470 | 97319 | 1152 | gp:AF148322_1 | Streptomyces viridifaciens vlmF | 26.5 | 59.7 | valanimycin resistant protein
3607 | 99819 | 100493 | [?] | - | - | - | - | -
3608 | 101582 | 98808 | 2775 | - | - | - | - | -
3609 | 103435 | 101612 | 1824 | [?] | Escherichia coli K12 htpG | [?] | 52.7 | heat shock protein (hsp90-family)
3610 | 103494 | 104909 | [?] | [?] | Escherichia coli K12 amn | 41.0 | 68.2 | AMP nucleosidase
3611 | 105751 | 105173 | [?] | - | - | - | - | -
3612 | 106392 | 105841 | [?] | [?] | Aeropyrum pernix K1 APE2509 | [?] | 58.7 | acetolactate synthase large subunit
3613 | 107289 | 106630 | [?] | - | - | - | - | -
3614 | 107435 | 110890 | 3456 | sp:PUTA_SALTY | Salmonella typhimurium putA | [?] | 50.4 | proline dehydrogenase/P5C dehydrogenase
3615 | 111161 | 111274 | [?] | - | - | - | - | -
3616 | 111374 | 112318 | [?] | [?] | Phanerochaete chrysosporium aad | 30.2 | 60.7 | aryl-alcohol dehydrogenase (NADP+)
3617 | 112470 | 114083 | 1614 | [?] | Escherichia coli K12 ydaH | 36.5 | 71.4 | pump protein (transport)
3618 | 114147 | [?] | 1332 | [?] | Enterobacter agglomerans | [?] | 49.2 | indole-3-acetyl-Asp hydrolase
3619 | 115262 | 114564 | [?] | - | - | - | - | -
3620 | 115578 | 115943 | [?] | [?] | Escherichia coli K12 yidH | [?] | 70.8 | hypothetical membrane protein
3621 | 115949 | 116263 | [?] | - | - | - | - | -

[?] marks a cell that is illegible in the source; - marks a cell with no entry. The Matched length (aa) column is illegible throughout and is omitted, and the SEQ NO. (DNA) column is legible only for the first row (100).
Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Function
3622 | 118599 | 116548 | 2052 | - | - | - | - | -
3623 | 119589 | 118810 | [?] | sp:ACCR_AGRTU | Agrobacterium tumefaciens accR | 29.5 | [?] | transcriptional repressor
3624 | 120021 | 120410 | [?] | pir:C70019 | Bacillus subtilis yurT | [?] | 78.6 | methylglyoxalase
3625 | 120922 | 120413 | [?] | [?] | Mycobacterium tuberculosis H37Rv Rv1276c | [?] | 64.8 | hypothetical protein
3626 | 122459 | [?] | 1509 | prf:2309180A | Pseudomonas fluorescens mtlD | [?] | 70.4 | mannitol dehydrogenase
3627 | 123841 | 122507 | 1335 | prf:2321326A | Klebsiella pneumoniae dalT | 30.3 | 68.3 | D-arabinitol transporter
3628 | 123842 | 124030 | [?] | - | - | - | - | -
3629 | 124130 | 124966 | [?] | sp:GATR_ECOLI | Escherichia coli K12 gatR | [?] | 64.6 | galactitol utilization operon repressor
3630 | 124932 | 126350 | 1419 | [?] | Streptomyces rubiginosus xylB | [?] | 68.1 | xylulose kinase
3631 | 127171 | 127992 | [?] | - | - | - | - | -
3632 | 127189 | 126353 | [?] | [?] | Corynebacterium glutamicum ATCC 13032 panC | 100.0 | 100.0 | pantoate-beta-alanine ligase
3633 | 128004 | 127192 | [?] | gp:CGPAN_1 | Corynebacterium glutamicum ATCC 13032 panB | 100.0 | 100.0 | 3-methyl-2-oxobutanoate hydroxymethyltransferase
3634 | 129049 | 128099 | [?] | - | - | - | - | -
3635 | 130118 | 129489 | [?] | [?] | Arabidopsis thaliana mag | [?] | 67.6 | DNA-3-methyladenine glycosylase
3636 | 130145 | 130798 | [?] | - | - | - | - | -
3637 | 131738 | 130815 | [?] | gp:AB029896_1 | Petroleum-degrading bacterium HD-1 hde | 39.3 | 69.3 | esterase
3638 | 131798 | [?] | [?] | - | - | - | - | -
3639 | 132424 | 132981 | [?] | sp:CAH_METTE | Methanosarcina thermophila | 30.9 | 53.2 | carbonate dehydratase
3640 | 134113 | 132971 | 1143 | sp:XYLR_BACSU | Bacillus subtilis W23 xylR | [?] | 49.3 | xylose operon repressor protein
3641 | 135478 | [?] | 1272 | gp:LLLPK214J2 | Lactococcus lactis mef214 | [?] | 61.2 | macrolide efflux protein
3642 | 136321 | 135518 | [?] | - | - | - | - | -
3643 | 136565 | 136122 | [?] | - | - | - | - | -

[?] marks a cell that is illegible in the source; - marks a cell with no entry. The Matched length (aa) column is illegible throughout and is omitted, and the SEQ NO. (DNA) column is legible only for SEQ NO. (a.a.) 3629 (129).
Table 1 (continued) [one table page illegible in the source]

Table 1 (continued)

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp) | db Match | Homologous gene | Identity (%) | Similarity (%) | Function
3668 | 160029 | 160370 | [?] | [?] | [?] | 35.6 | 56.7 | methyltransferase
3669 | 160431 | 161360 | [?] | - | - | - | - | -
3670 | 161696 | 162352 | [?] | - | - | - | - | -
3671 | 162295 | 161363 | [?] | - | - | - | - | -
3672 | 162463 | 162867 | [?] | [?] | Neisseria meningitidis MC58 NMB0662 | [?] | 76.3 | ribonuclease
3673 | 162965 | 163603 | [?] | - | - | - | - | -
3674 | 165717 | 166457 | [?] | - | - | - | - | -
3675 | 165755 | 163689 | 2067 | [?] | [?] | 28.5 | 57.2 | neprilysin-like metallopeptidase 1
3676 | 166457 | 167419 | [?] | - | - | - | - | -
3677 | 168595 | 167837 | [?] | sp:FARR_ECOLI | Escherichia coli K12 farR | 29.8 | 65.6 | transcriptional regulator, GntR family or fatty acyl-responsive regulator
3678 | 168975 | 169991 | 1017 | [?] | Beta vulgaris | [?] | 63.0 | fructokinase or carbohydrate kinase
3679 | [?] | 170916 | [?] | [?] | Streptomyces coelicolor A3(2) SC8F11.03c | 52.7 | 80.7 | hypothetical protein
3680 | 170933 | 172444 | 1512 | [?] | Streptomyces coelicolor msdA | 61.0 | 86.1 | methylmalonic acid semialdehyde dehydrogenase
3681 | 172468 | 173355 | [?] | sp:IOLB_BACSU | Bacillus subtilis iolB | 33.2 | 58.2 | myo-inositol catabolism
3682 | 173548 | 175275 | [?] | sp:IOLD_BACSU | Bacillus subtilis iolD | [?] | 69.8 | myo-inositol catabolism
3683 | 175319 | 176272 | [?] | [?] | Rhizobium meliloti mocC | 29.7 | 51.0 | rhizopine catabolism protein
3684 | 176308 | 177318 | 1011 | sp:MI2D_BACSU | Bacillus subtilis idh or iolG | [?] | 72.2 | myo-inositol 2-dehydrogenase
3685 | 177334 | 178203 | [?] | sp:IOLH_BACSU | Bacillus subtilis iolH | [?] | 72.1 | myo-inositol catabolism
3686 | 178285 | 179658 | 1374 | sp:TCMA_STRGA | Streptomyces glaucescens tcmA | [?] | 61.5 | metabolite export pump of tetracenomycin C resistance
3687 | 179081 | [?] | [?] | - | - | - | - | -
3688 | 179689 | 180711 | 1023 | sp:YVAA_BACSU | Bacillus subtilis yvaA | 31.1 | 65.5 | oxidoreductase
3689 | 180842 | 181297 | [?] | - | - | - | - | -

[?] marks a cell that is illegible in the source; - marks a cell with no entry. The SEQ NO. (DNA) and Matched length (aa) columns are illegible throughout and are omitted.
in 
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regulatory protein 


oxidoreductase 


hypothetical protein 




cold shock protein 






caffeoyl-CoA 3-O-methyltransferase 




glucose-resistance amylase 
regulator regulator 






D-xylose proton symporter 




transposase (ISCg2) 


signal-transducing histidine kinase 


glutamine 2-oxoglutarate 
aminotransferase large subunit 
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Streptomyces reticuli cebR 


Rhizobium sp. NGR234 y4hM 


Bacillus subtilis yfiH ' 
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Lactobacillus brevis xylT 




Corynebacterium glutamicum 1 
ATCC 13032tnp is 


Rhizobium meliloti fixL 


1 Corynebacterium glutamicum 
jgltB 


'Corynebacterium glutamicum 
gltD 




Mycobacterium tuberculosis 
H37Rv Rv3698 




db Match 




CO 

r— 

O) 

UJ 

or 

CO 
id. 
cn 


:z 
to 

X 

rr 

X 

> 

d. 

cn 


sp:YFIH_BACSU 




sp:CSP_ARTGO 






prf:2113413A 




sp:CCPA_BACSU 






sp:XYLT_LACBR 




1 

CO 
CO 

< 

Ol 

cn 


LU 
X 

cr 

J 

X 

UL 

cL 

</> 


gp:AB024708_1 


CN 

«■ 

O 
CN 

0 
CO 
< 
id. 
cn 




pir:C70793 






ORF 


CO 

ro 


CO 

o 

CO 


1233 


1011 


01 

CN 


0 

CN 


CO 

in 


to 
0 

CO 




to 

CN 


0 
01 
cn 


CN 
O 


O 
CN 


1473 


O 
O 
CO 


1203 


in 

CO 

■*r 


4530 


1518 


O 
T 

CN 


m 

CO 

rr 


cn 

CO 
CO 


45 


Terminal 
(nt) 


181647 


181687 , 


184051 I 


185087 


185642 


186708 


187302 


187607 


188100 


188300 


188747 I 


! 190321 


190389 


190703 


192949 


194464 


194604 


199769 


201289 


201341 


201760 


205956 


50 


Initial 
(nt) 


181264 


182679 


182819 I 


184077 


185214 


186508 


186769 


187302 


187687 


188725 


189736 


169920 


190628 


192175 


193248 


193262 


195038 


195240 


199772 


201580 


203244 


205588 




So- 


3690 


3691 


3692 


3693 


3694 


3695 


3696 


3697 


3698 


3699 


3700 


3701 


3702 


3703 


3704 


3705 


3706 


3707 


3708 


§ 

CO 


0 

CO 


3711 




SEQ 
NO. 
(DNA) 


o 
cn 




(N 
O) 


CO 
Ol 


Ol 


tn 
cn 


CO 
Ol 


01 


CO 

o> 


01 

cn 


0 
0 

CN 


0 

CN 


202


CO 

0 

CN 


O 
CN 


m 
0 

CN 


to 
0 

CN 


r-. 
0 

CN 


CO 

0 

CN 


§ 

CN 


CN 


CN 






EP 1 108 790 A2 





Function 




probable electron transfer protein 


amino acid carrier protein 




molybdopterin biosynthesis protein 
moeB (sulfurylase) 


molybdopterin synthase, large 
subunit 


molybdenum cofactor biosynthesis 
protein CB 


co-factor synthesis protein 


molybdopterin co-factor synthesis 
protein 


hypothetical membrane protein 


molybdate-binding periplasmic 
protein 


molybdopterin converting factor 
subunit 1 


maltose transport protein 


hypothetical membrane protein 


histidinol-phosphate 
aminotransferase 
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Mycobacterium tuberculosis 
H37Rv Rv3571


Bacillus subtilis alsT




Synechococcus sp. PCC 7942 
moeB 


Arthrobacter nicotinovorans 
moaE


Synechococcus sp. PCC 7942 
moaCB 


Arthrobacter nicotinovorans 
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Arthrobacter nicotinovorans > 
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modA . 
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Streptomyces coelicofor A3(2) 
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Zymomonas mobilis hisC 










db Match 




PIR:A70606 


sp:ALST_BACSU 




gp:SYPCCMOEB_ 


o 

CO 
cr> 

CN 

cn 
o 

CN 
*C 


sp:MOCB_SYNP7


prf:2403296C 


CN 

i 

r- 

co 
o 

> 

< 

ii 
cn 


prf:2403296F


UJ 

to 

OJ 
CN 

tn 
o 
-<r 

CN 

t: 

CL. 


pir:D70816


prf:2518354A


sp:YPT3_STRCO 


o 

5 
>- 
rsj 

CO 1 

CO 

X 

CL 

tn 










ORF 

(bp)


CN 
CO 

LO 


cn 

CN 


(O 


OJ 

o 

CO 


1083 


to 

LO 




CO 

to 


1185 


ro 

CN 


o 
co 


est 

CO 


CN 
OJ 


o 

CN 


1023 


to 

O 
O) 


OJ 
CN 


o 

CN 




Terminal 
(nt) 


221131 


222207 


222210 


CN 
LO 
CN 
CN 


225242 


226312 


226760 


227218 


227703 


228891 


229711


230928 


230931 


231848 


232260 


234818 


234910 


235409 




Initial 
(nt) 


221712 


221911 


223685 


224336 


226324 


I 

226767 


227230 


227685 


228887 


229613 


230514


230608 


231842 


232267 


233282 


233913


235203 


235290 




SEQ 

NO. 
(a.a.) 


3731 


3732 


3733 


CO 

cn 


3735 


3736 


3737 


3738 


3739 


3740 


3741 


3742 


CO 

*r 

CO 


t 

" 


3745 


3746 


3747 


3748 




SEQ
NO.
(DNA)


CN 

co 

CN 


cn 
cn 

CN 


-n- 
cn 

CN 


LO 

ro 

CN 


236 


r- 
f ^ 

I ~ 


CO 

cn 

CN 


CO 
CO 

CN 


o 

CN 


CN 


CN 
CN 


CO 
CN 


rr 
CN 


LO 
T 
CN 


to 


CN 


CO 
CN 






Function 






metalloregulatory protein 


arsenic oxyanion-translocation pump 
membrane subunit 


arsenate reductase 








Na+/H+ antiporter or multiple 
resistance and pH regulation related 
protein D 


Na+/H+ antiporter 


Na+/H+ antiporter or multiple 
resistance and pH regulation related 
protein A 








transcriptional activator 


two-component system sensor 
histidine kinase 


alkaline phosphatase 




phosphoesterase 


hypothetical protein 
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Sinorhizobium sp. As4 arsR 


Sinorhizobium sp. As4 arsB 


Staphylococcus xylosus arsC 








Bacillus firmus OF4 mrpD 


Staphylococcus aureus mnhC 


Bacillus firmus OF4 mrpA 
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Mycobacterium tuberculosis 
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Function 


class A penicillin-binding
protein (PBP1)


regulatory protein 




hypothetical protein 


transcriptional regulator 


shikimate transport protein 




long-chain-fatty-acid-CoA ligase 


transcriptional regulator 


3-oxoacyl-(acyl-carrier-protein)
reductase


glutamine synthetase 


short-chain acyl CoA oxidase 


nodulation protein 


hydrolase 






cAMP receptor protein 




ultraviolet N-glycosylase/AP lyase 


cytochrome c biogenesis protein 
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Table 1 (continued) 


Homologous gene 


Mycobacterium leprae pon1 


Streptomyces coelicolor A3(2) 
whiB 




Streptomyces coelicolor A3(2) 
SCH17.10c 


Mycobacterium tuberculosis 
H37Rv Rv3678c 


Escherichia coli K12 shiA 




Bacillus subtilis lcfA


Streptomyces coelicolor A3(2) 
SCJ4.28c 


Bacillus subtilis fabG 


Emericella nidulans fluG 
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Rhizobium leguminosarum nodN 


Mycobacterium tuberculosis 
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Function 


hypothetical protein 


serine proteinase 


epoxide hydrolase 


hypothetical membrane protein 


phosphoserine phosphatase 


hypothetical protein 


conjugal transfer region protein 




hypothetical membrane protein 


hypothetical protein 


hypothetical protein 








ATP-dependent RNA helicase 


cold shock protein 




DNA topoisomerase \ 
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Mycobacterium tuberculosis ,, 
H37Rv Rv3669 


Mycobacterium leprae 1 
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Function 


adenylate cyclase 


DNA polymerase III subunit 
tau/gamma 




hypothetical protein 


hypothetical protein 


ribosomal large subunit 
pseudouridine synthase C 


beta-glucosidase/xylosidase 


beta-glucosidase 


NAD/mycothiol-dependent 
formaldehyde dehydrogenase 




metallo-beta-lactamase superfamily 


3-oxoacyl-(acyl-carrier-protein)
reductase


valanimycin resistant protein 


dTDP-glucose 4,6-dehydratase 


hypothetical protein 


dolichol phosphate mannose 
synthase 




nucleotide sugar synthetase 


UDP-sugar hydrolase 
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length 
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; 

25.9 


26.3


CO 
CO 


59.3 
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Homologous gene 


Stigmatella aurantiaca B17R20 
cyaB 


Bacillus subtilis dnaX 




Ureaplasma urealyticum uu033 


Deinococcus radiodurans 
DR0202 


Escherichia coli K12 rluC 


Erwinia chrysanthemi D1 bgxA 


Azospirillum irakense salB 


Amycolatopsis methanolica 




Rhodococcus erythropolis orf5 


Escherichia coli K12 fabG 


Streptomyces viridifaciens vlmF 


Actinoplanes sp. acbB


Mycobacterium tuberculosis 
H37Rv Rv3632 


Methanococcus jannaschii JAL-1 MJ1222




Escherichia coli K12 yefJ


Salmonella typhimurium ushA




db Match 


sp:CYAB_STIAU 


sp:DP3X_BACSU 




CO 

c 1 

o 

CM 
O 

o 

LU 

< 
Cl 
cn 


gp:AE001882_8 


sp:RLUC_ECOLI


X 

u 

<: 
cn 

LU 

t 

CD 
m 

Cl 
tn 


CM 

i 

Oi 
CM 

o 
cn 
o 

Ll_ 

< 

CL 

co 


LU 

>- 
< 

J 

X 

a 
< 

LL 

D_ 

CO 




to 
O 
X 

or 

i 

in 
X 
t— 
>- 

id. 

CO 


sp:FABG_ECOLI 


i 

CM 
CM 
CO 
CO 
T 

LL. 
< 

id. 

CJ) 


prf:2512357B 


pir:A70562 


< 

LU 

CM 1 
CM 

o 
> 

tn 




i 

o 
o 

UJ 

1 

— > 

LL 
LU 
> 

id. 

CO 


sp:USHA_SALTY 




n 


1041 


1257 


CM 
CO 




CD 
ID 


CM 

co 

CO 


1644 


1989 


1104 


CM 
CO 


h- 

CO 

m 


cn 
to 

CO 


1230 


ro 
to 
cn 


in 
ro 


cn 
m 

r*- 


1029 


1035 


2082 


CM 
CD 


Terminal 
(nt) 


326695 


329539 


329909 


330376 


331533 


332433 


334562 


334953 


336112 


335185


336748 


337449 


338768 


339725 


340195 


340569 


342375 


343451 


345717 


345814 


Initial 
(nt) 


327735 


328283 


329748 


329933 


330973 


331552 


332919


332965 


335009 


335805 


336212 


336781 


337539 


338793 


340569 


341327 


341347 


CM 

•*r 
cn 


343636 


345975 


SEQ 
NO. 

(a.a.)


3849 


3850 


3851 


3852 


3853 


3854 


3855 


3856 


3857 


3858 


3859 


3860 


3861 


3862 


3863 


3864 


3865 


3866 


3867 


3868 


SEQ 
NO. 
(DNA) 


cn 
*r 

CO 


o 

ID 

ro 


in 

n 


CM 
LO 

c 


ro 
tn 

« 


m 

CO 


ID 
ID 
CO 


CO 

8 


r- 

m 

CO 


co I cn 
in ; in 
ro I co 


S 

CO 


CO 

ro 


CM 
CO 
ro 


363 


■^r 
co 
ro 


m 

CD 
CO 


CO 
CD 
CO 


r— 

CO 
CO 


CO 

ro 














dTDP-4-keto-L-rhamnose reductase 


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lipopolysaccharide biosynthesis / 
aminotransferase 


Function 




NADP-dependent alcohol 
dehydrogenase 


glucose-1-phosphate
thymidylyltransferase


dTDP-glucose 4,6-dehydratase


NADH dehydrogenase 


Fe-regulated protein 




hypothetical membrane protein


metallopeptidase 


prolyl endopeptidase 




hypothetical membrane protein


r 

cell surface layer protein 


autophosphorylating protein 
kinase 


protein phosphatase 




capsular polysaccharide 
biosynthesis 


ORF 3 


Matched 
length 
(aa) 




CO 
T 

ro 


to 

CO 
CN 


OJ 

cn 


ro 

T 

ro 


to 
o 

CN 


LO 

CN 
CO 




CO 
CN 


to 


CO 

o 
f- 




CO 
to 

CN 


CO 

to 
ro 


CO 

to 


CN 

o 




ro 
CO 


o 

CO 


cn 

CO 


Similarity 
(%) 




74.9 


84.9 


74.0 


CO 
CO 


61.2 


66.5 




68.3 


62.5 


56.4 




46.0 


76.6 


57.2 


68.6 




65.7 


51.0 


CO 
CO 
CO 


Identity
(%)




52.2 


ro 

oi 

CO 


-L.O-. 

cn 

N" 


61.13 


tb 
ro 


CO 

ro 




CO 


34. i 


CO 
CN 




26.0


50.7 


Lf) 

CO 
CM 


39.2




33.0 


o" 


37.1 


Homologous gene 




Mycobacterium tuberculosis 
H37Rv adhC 


Salmonella anatum M32 rfbA 


Streptococcus mutans rmlC


Streptococcus mutans XC rmlB 


Thermus aquaticus HB8 nox 


Staphylococcus aureus sirA 




Mycobacterium tuberculosis 
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Streptomyces coelicolor 
SC5F2A.19c 


Sphingomonas capsulata 




Streptomyces coelicolor A3(2) 


Corynebacterium 
ammoniagenes ATCC 6872 


Acinetobacter johnsonii ptk 
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Vibrio cholerae 
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CD 


























Function 


dihydrolipoamide dehydrogenase 


UTP-glucose-1-phosphate 
uridylyltransferase 


regulatory protein 


transcriptional regulator 


cytochrome b subunit 


succinate dehydrogenase 
flavoprotein 


succinate dehydrogenase subunit 












hypothetical protein 


hypothetical protein 






tetracenomycin C transcription 
repressor 




transporter 
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length 
(a.a) 


CO 

<o 


tn 

CO 
CN 


CO 

m 




O 
CO 
CN 


CO 

o 

CO 


CO 

m 

CN 












CO 

m 

CN 


CO 






r- 

CO 




CO 
CO 










































Similarity
(%)


100.0 


68.1 


71.9 


81.3 


CO 


61.2 


56.2 












49.8 


64.3 






53.8 




74.6 










































. Co" . 


99.6 


41.7 


43.8


57.0


Op 
co 
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- 


26.3 
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36.1 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 lpd


Xanthomonas campestris 


Pseudomonas aeruginosa PAO1
orfX 


Mycobacterium tuberculosis 
H37Rv Rv0465c 


Streptomyces coelicolor A3(2) 
SCM10.12c 


Bacillus subtilis sdhA 


Paenibacillus macerans sdhB












Streptomyces coelicolor 
SCC78.05 


Escherichia coli K12 yjiN






Streptomyces glaucescens 
GLAOtcmR 




Streptomyces fradiae T#2717 
urdJ 


db Match 


gp:CGLPD_1 


pir:JC4985 


CN 

1 

to 
to 
to 

CO 

3 
< 

CL 
CL 

cn 


pir:E70828 


CN 
O 

i 
a 

CO 
Q. 
CO 


pir:A27763 


< 
u 

X 
Q 
CO 

m 

CL 
CO 












gp:SCC78_5 


sp:YJIN_ECOLI 






sp:TCMR_STRGA 




to 

CO 
t 

.to 

LL 
< 
Ql 
CO 


LL <— 

O ^ 


1407 


CN 

cn 


CO 
CO 


1422 




1875 


r-- 

CO 
CO 


to 

CO 
CO 


to 

CN 


o 

CO 

to 


to 

CO 


CO 
CO 
CO 


m 
r- 

CO 


1251 


O 
CN 
•<T 


ro 
O 
CO 


CO 

to 


•c- 
O 
CN 


1647 


Terminal 
(nt) 


389098 


390168 


390730 


390787 


393475 


395513 


396262 


396650 


396932 


396411 


397825 


398222 


397232 


399579 


400017 


400341 


401150 


401253 


402796 


Initial 
(nt) 


387692


389248 


390233 


392208 


392705 


393639 


395426 


396315 


396672 


397040 


397730 


397884 


398206 


398329 


399598 


400039 


400473 


401050 


o 
m 

o 

T 


SEQ
NO.
(a.a.)


3908 


3909 


3910 


3911 


3912


3913 


3914 


3915 


3916 


3917 


3918 


3919 


3920 


3921 


3922 


3923 


3924 


3925 


3926 


SEQ 
NO. 


CO 

o 


CO 

o 


o 




CN 
T 


CO 
T 




in 


i 

to 
-<r 




CO 
T 

I 


CO 

5 


o 

CN 


_l 

CN I 
^ | 


CN 
CN 


CO 
CN 


T 

CN 


in 

CN 


to 

CN 

1 





Function 


transporter 


formyltetrahydrofolate deformylase 


deoxyribose-phosphate aldolase 






hypothetical protein 


hypothetical protein 




cation-transporting P-type ATPase B 




glucan 1,4-alpha-glucosidase 


hemin-binding periplasmic protein 


ABC transporter 


ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical protein 
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Homologous gene 


Streptomyces fradiae T#2717, 
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Corynebacterium sp. P-1 purU 


Bacillus subtilis deoC 






Mycobacterium avium GIR10 
mav346 


Mycobacterium tuberculosis 
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Saccharomyces cerevisiae 
S288C YIR019C sta1


Corynebacterium diphtheriae 
hmuT 


Corynebacterium diphtheriae 
hmuU 


Corynebacterium diphtheriae 
hmuV 


Streptomyces coelicolor C75A 
SCC75A.17c 


Streptomyces coelicolor C75A 
SCC75A.17c
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hypothetical protein 






phosphoserine phosphatase 


hypothetical protein 




glutamyl-tRNA reductase


hydroxymethylbilane synthase 




cat operon transcriptional regulator 


shikimate transport protein 


3-dehydroshikimate dehydratase 


shikimate dehydrogenase 




putrescine transport protein 




iron(III)-transport system permease
protein




periplasmic-iron-binding protein 


uroporphyrin-III C-methyltransferase
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Homologous gene 
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Mycobacterium tuberculosis 
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Acinetobacter calcoaceticus 
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Corynebacterium glutamicum 
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Function 


transcriptional repressor 


adenylate kinase 




methionine aminopeptidase 




translation initiation factor IF-1 


30S ribosomal protein S13 


30S ribosomal protein S11


30S ribosomal protein S4 


RNA polymerase alpha subunit 




50S ribosomal protein L17


pseudouridylate synthase A 


hypothetical membrane protein 






hypothetical protein 


cell elongation protein 


cyclopropane-fatty-acyl-phospholipid 
synthase 


hypothetical membrane protein 


Matched 
length 
(aa) 


to 
m 

CM 


CO 




ro 
in 

CN 




CN 

r*- 


CN 
CN 


T 

CO 


CN 
CO 


ro 




CN 
CN 


in 
to 

CN 


to 

CO 
r- 






m 

CO 


509 


CO 
CM 


o 
o 


Similarity 
(%) 


o 

CO 

to 


81.0 




r- 




86.0


91.0 


93.3 


93.9 


77.8




77.1 


61.1 


51.2






53.8 


cn 

s 


56.0 


o 
cri 
in 


Identity 

(%)


m 

CO 
CM 


48.9




cn 




C> 


to 
to 


81.3 


CN 
CO 


LO 






37.6 


CN 






T 

CN 


22.8


O 
CO 


28.0 


Homologous gene 


Erwinia carotovora carotovora 
kdgR 


Micrococcus luteus adk 




Bacillus subtilis 168 map 




Bacillus subtilis infA 


Thermus thermophilus HB8
rps13


Streptomyces coelicolor A3(2) 
SC6G4.06 rpsK


Mycobacterium tuberculosis 
H37Rv Rv3458c rpsD


Bacillus subtilis 168 rpoA




Escherichia coli K12 rplQ 


Escherichia coli K12 truA


Mycobacterium tuberculosis 
H37Rv Rv3779 






Mycobacterium tuberculosis
H37Rv Rv0283 


Arabidopsis thaliana CV DIM


Escherichia coli K12 cfa 


Streptomyces coelicolor A3(2) 
SCL2.30c 


db Match 


< 

CO 

o 

CO 

CM 

tn 

CM 

t: 
o_ 


sp:KAD_MICLU 




ZD 
CO 

o 
< 

=' 

Ol 
< 

CL 
CO 




T 

to 
cn 

CO 
LL 

u 

CL 


prf:2505353B 


sp:RS11_STRCO 


LU 
r- 

CO 
CN 

T- 

CN 
CN 

ir 

CL 


sp:RPOA_BACSU 




sp:RL17_ECOLI 


sp:TRUA_ECOLI 


m 
cn 
CO 

o 

r*- 
O 

Q. 






pir:A70836 


X 

< 

5 

Q 
b. 

tn 


_j 
O 
o 

LU 

<' 

LL 

o 

CL 

CO 


o 

CO 

i 

CN 
_J 

O 
tn 

CL 

cn 


SI 


O 
CO 


co 

T 

m 


CN 
CO 


CM 
CD 

r*- 


CO 
CN 
CO 


to 

CN 


to 
to 
cn 


CM 
O 


CO 
O 

to 


1014 


to 
tn 


cn 

CO 


to 

CO 


2397 


to 

LO 


CO 

o 

CO 


1257 


1545 


1353 


to 

CN 


Terminal 
(nt) 


568272 


571316 


570756 


572267 


573176 


573622 


574181


574588 


575217 


576351


575211


576898


577923


580429


580436 


580919 


582662 


584228 


585620 


586248 


Initial 
(nt)


569075 


570774 


571367 


CO 

*r 

r- 
m 


572349 


573407


573816 


574187 


574615 


575338


575366 


576410 


577057 


578033 


580891 


CN 
CN 

CO 

in 


581406 


582684


584268 


585823 


SEQ 
NO. 
(a.a.) 


CO 


cn 

T— 


4120 


4121 


CN 
CN 

i 


4123 


CM 


in 

CN 


CO 
CM 


CN 
T 


CO 
CN 


4129 


o 

ro 
T 


4131 


CN 
ro 


4133 


4134 


4135 


4136 


cn 


SEQ 
NO. 
(DNA) 


CO 

to 


cn 

CO 


CM CN 

to to 


CN 
CM 
CO 


CO 
CN 
CO 


^ m 

CM 1 CN 
to • CO 

f 


CO 
CN 

to 


CM 

to 


CO 
CN 
CO 


cn 

CN 

to 


O 
CO 

to 


CO 

to 


CN 
ro 
CO 


CO 
CO 

to 


•*r 
co 
to 


LO 

ro 
to 


to 

CO 
CO 


CO 

to 
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Function 


hypothetical membrane protein 


proline iminopeptidase 


hypothetical protein 


ribosomal-protein-alanine N-acetyltransferase


O-sialoglycoprotein endopeptidase 


hypothetical protein 






heat shock protein groES 


heat shock protein groEL 


hypothetical protein 


hypothetical protein 


regulatory protein 


RNA polymerase sigma factor 




hypothetical protein 


IMP dehydrogenase


hypothetical protein 


Matched 
length 
(a.a) 


o 

LO 

lO 




r— 
o 

CN 


CM 

to 


CO 
CO 


r*- 
cn 






o 
o 


r- 
co 
m 


CO 


CO 
CO 


CO 






CO 


O 

in 


CD 


Similarity 
(%) 


66.2 


77.6 


75.4 


59.9 


75.2 


59.4 






94.0 


85.1 


56.0 


45.0 


88.3 


81.6 




69.8 


93.9 


53.0 


Identity
(%)


CO 

oo_ 

CN 


CO 
LO 


52.2 


30.3 


•CO 


cc 

CO 






76.0 


63.3 


50.0


o 

CO 


CO 
CO 


55.2




5 r 


8,08 


39.0


Homologous gene 


Escherichia coli K12 yidE 


Propionibacterium shermanii pip 


Mycobacterium tuberculosis 
H37Rv Rv3421c


Escherichia coli K12 rimI


Pasteurella haemolytica 
SEROTYPE A1 gcp 


Mycobacterium tuberculosis 
H37Rv Rv3433c 






Mycobacterium tuberculosis 
H37Rv Rv3418c mopB


Mycobacterium leprae 
B229_C3_248 groE1 


Mycobacterium tuberculosis 


Mycobacterium tuberculosis 


Mycobacterium smegmatis 
whiB3 


Mycobacterium tuberculosis
H37Rv Rv3414c sigD




Mycobacterium leprae 
B1620_F3_131 


Corynebacterium 
ammoniagenes ATCC 6872 
guaB 


Pyrococcus horikoshii PH0308 


db Match 


sp:YIDE_ECOLI 


gp:PSJ00161_1 


ZD 
t- 
O 

>- 

co 1 

CO 

o 
>- 

ol 
to 


i 

o 
o 

LU 

1 

or 

CL 

in 


sp:GCP_PASHA 


ZD 
i— 
O 

> 
cn' 

CL 

%n 






sp:CH10_MYCTU 


sp:CH61_MYCLE 


GP:MSGTCWPA_1 


CO 

«■ 

CL 

o 

1— 

CD 

CO 
CL 

O 


o' 
o 

CO 
CO 

r- 
O 

LL 
< 

C7) 


sp:Y09F_MYCTU 




LU 
-J 

o 
>- 

i 

X 

CO 

o 

>- 

CL 

to 


in 

CO 

o 
o 

CO 

< 
cL 
cn 


PIR:F71456 


81 


1599 


1239 


in 

CO 


r- 
o 
m 


1032 


1722 


CO 

rg 


CO 

in 

TT 


CO 
CN 


1614 


m 
in 

CN 


1158 


1 — 
CO 
CN 


CO 

in 


1026 


CO 

r-- 
co 


1518 


r-- 

CN 
CD 


Terminal 
(nt) 


604409 


605708 


606392 


606898 


607936 


609679 


610175 


609816 


■<r 

CO 

o 

CO 


612272 


610946 


611109 


612418 


613719 


614747 


614803 


616853 


615605 


Initial 
(nt) 


602811 


604470 


605718 


606392 


606905 


607958 


609747 


610268 


co 

T 

n 

Q 
CO 


610659 


611200 


612266 


612714 


613156 


613722 


615180 


615336 


616231 


SEQ 
NO. 
(a.a.) 


4156 


4157 


4158 


4159 


o 
to 


4161 


CN 
CD 


4163 


CO 


4165 


4166 


4167 


4168 
4169 


4170 


4171 


4172 


CO 

^ ; 
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NO. 
(DNA) 


CO 

in 
to 


r- 
m 
to 

L 


CO 

in 

CO 


CO 

to 

CO 


o 
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CO 
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CO 
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CO 
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Function 


hypothetical membrane protein 


phytoene desaturase 


phytoene synthase 


transmembrane transport protein 


geranylgeranyl pyrophosphate 
(GGPP) synthase 


transcriptional regulator (MarR 
family) 


outer membrane lipoprotein 


hypothetical protein 


DNA photolyase 


glycosyl transferase 


ABC transporter 


ABC transporter 




ABC transporter 




ABC transporter 


lipoprotein 


DNA polymerase III 


hypothetical protein 


Matched 
length 


in 

CO 


XT 
OJ 

in 


CO 
CO 
OJ 


OJ 
CM 


r- 

CO 
CO 


cO 

CO 


m 

T 


CM 
CO 


O) 

■^r 


m 
o 

CM 


cn 
co 


CO 
CM 
CM 




ID 

o 

CM 




to 
co 


CO 

ID 
CM 


1101 


CO 

m 


Similarity 


67.4 


76.2 


71.2 


75.6 


63.8 


CO 

to 


62.1 


74.2 


CM 
CO 

to 


53.7 


54.9 


72.2 




75.2 




75.4 


67.2 


57.5 


62.3 


Identity 
(%)


00 

ro 


_o . 
in 


42.0 


43.6 


32.7 


CO 
CO 


33.1 


r- 


40.0 


cn 

CM 


CO 
CM 


35.4




35.9 




CO 
CO 


r-- 
co 

CM 


CM 

O 
CO 


in 
*r 


Homologous gene 


Mycobacterium marinum 


Brevibacterium linens ATCC
9175 crtI


Brevibacterium linens ATCC 
9175 crtB 


Streptomyces coelicolor A3(2)
SCF43A.29c


Brevibacterium linens crtE 


Brevibacterium linens 



Citrobacter freundii blc OS60 blc


Brevibacterium linens 


Brevibacterium linens ATCC 
9175 cpdl 


Streptococcus suis cpslK 


Streptomyces coelicolor A3(2) 
SCE25.30


Bacillus subtilis 168 yvrO 




Helicobacter pylori abcD 




Escherichia coli TAP90 abc 


Haemophilus influenzae 
SEROTYPE B hlpA 


Thermus aquaticus dnaE 


Streptomyces coelicolor A3(2)
SCE126.11 


db Match 


CO 

i 

in 
r*- 
O 
CN 
O) 

CL 

cn 


gp:AF139916_3 


gp:AF139916_2 


gp:SCF43A_29 


gp:AF139916_11 


gp:AF139916_14 


cr: 

LL 

t— 

°. 

o 
_j 

CO 

CL 

CO 


to 
cn 

CO 
CO 

LL 
< 
Cl 
cn 


gp:AF139916_5 


f 

o 
oo 
in 
m 

LL 

< 

CL 
CO 


o 

ro 

«' 

CM 
LU 
U 
if) 
cL 
cn 


prf:2420410P




o 

co 

8 

CM 
CO 
CM 

tr 

Cl 




sp:ABC_ECOLI


sp:HLPA_HAEIN 


< 
to 

CO 
CO 

in 

CM 

t: 

CL 


CM 

LU 

CJ 
C/3 

cL 
co 


ORF 

(bp)


CD 

cn 

CO 


T 

CO 


CM 
CD 


2190 


1146 


tn 
oo 
m 


GO 
T 

to 


in 

CM 
V 


1404 


CO 

m 
r- 


2415 




CO 

in 


CO 
CD 
CO 


CO 
CO 


1080 


cn 

CD 


3012 


r-- 
t 


Terminal 
(nt) 


633079 


633532 


635178 


636089 


638317 


640208 


640232 


642557 


642556 


644778 


645176 


647593 


648315 


O 

■*r 
CO 
T 
CO 


650187 


CO 

to 


650392 


654612 


655122 


Initial 
(nt) 


633474 


635175 


636089 


638278 


639462 


639624 


cn 
r- 

CO 

O 

TT 

to 


641133 


643959 


CO 
CM 
O 
■*r 

T 
CD 


647590 


648309 


648467 


649105 


649342 


650193 


651288 


651601 


654676 


SEQ 

NO. 
(a.a.) 


4193 


OJ 


4195 


4196 


4197 


4198 


4199 


4200 


4201 


4202 


4203 


*<r 
o 

CM 
T 


4205 


4206 


4207 


4208 


cn 
o 

CM 


4210 


4211 


SEQ 
NO. 
(DNA) 


CO 

cn 

CD 


cn 

ID 
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CO CO 


cn 

to 
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Function 


hypothetical membrane protein 




transcriptional repressor 


hypothetical protein 




transcriptional regulator (Sir2 family) 


hypothetical protein 


iron-regulated lipoprotein precursor 


rRNA methylase 


methylenetetrahydrofolate 
dehydrogenase 


hypothetical membrane protein 


hypothetical protein 




homoserine O-acetyltransferase 


O-acetylhomoserine sulfhydrylase 


carbon starvation protein 




hypothetical protein 




Matched 
length 
(a.a) 


CO 

to 




CO 
O 
CN 


CO 
CN 




in 

CN 


in 


in 

CO 


m 


CO 

■ r-* 

CN 


O 

CO 


CO 
CO 




CO 

co 


CO 
CN 


o 

CO 
CD 




OS 












































Similarity
(%) 


56.0 




76.4 


61.7 




71.8 


78.3


62.2 


86.1 


87.4 


76.3 


63.2




99.5 


76.2 


78.4 




o 
to 

CO 




Identity 


to 




CO 

o 

'UJ 


34.9 




42.5 


45.2 


CO 


CO 

to 


70.9 


31.3 


o 

CO 




m 

CO 
CO 


CO 


CO 
CO 

m 




_o. . 
o 




Homologous gene 


Streptomyces coelicolor A3(2) 
SCE9.01 




Mycobacterium tuberculosis 
H37Rv Rv2788 sirR 


Streptomyces coelicolor A3(2) 
SCG8A.05c 




Archaeoglobus fulgidus AF1676


Streptomyces coelicolor A3(2) 
SC5H1.34 


Corynebacterium diphtheriae 
irp1 


Mycobacterium tuberculosis 
H37Rv Rv3366 spoU 


Mycobacterium tuberculosis 
H37Rv Rv3356c folD 


Mycobacterium leprae 
MLCB1779.16c


Streptomyces coelicolor A3(2) 
SC66T3.18c 




Corynebacterium glutamicum 
metA 


Leptospira meyeri metY 


Escherichia coli K12 cstA 




Escherichia coli K12 yjiX




db Match 


gp:SCE9_1




pir:C70884


gp:SCG8A_5 




pir:C69459 


ro 

X 
in 
O 
CO 

ol 
cn 


gp:CDU02617_1 


CO 
O 
r-~ 
UJ 

u 

CL 


pir:C70970 


gp:MLCB1779_8 


gp:SC66T3_18 




gp:AF052652_1 


prf:2317335A 


_j 
O 
o 

UJ 

JS' 

to 

CJ 
Cl 

CO 




sp:YJIX_ECOLI 






1413 


00 


CO 
ID 
CD 


CO 
CO 


CO 
CO 


t 
r- 

r-- 


CN 
CO 


to 

CO 
CO 




CN 

in 
CO 


m 
in 

CN 


1380 


CO 

to 

CO 


1131 


1311 


2202 


CO 

o 
to 


o 

CN 


609


Terminal 
(nt) 


656534 


655097 


657215 


657205 


658142 


658928 


659424 


660538 


660650 


662017 


662374 


662382 


664126 


665183 


666460 


670465 


669445 


670672 


671045 


Initial 
(nt) 


655122 


655834


656547 


658002 


658005 


658155 


658933 


659543 


661120 


661166 


662120 


663761


665088 


666313 


667770 


668264 


670053 


670472 


671653 


SEQ
NO.
(a.a.)


CN 

CN 
•*T 


4213 


CM 


in 

CN 


4216 


4217 


4218 


CO 

CN 
T 


4220


4221 


4222 


CO 
CN 
CN 


CN 
CN 


4225 


4226 


4227 


CO 
CN 
CN 
•<T 


4229 


4230 


SEQ 
NO. 
(DNA) 


CN 


CO 




m 


to 




5 


CO 
P- 


o 

CN 

r-- 


CN 


CN 
CN 


CO 
CN 


CM 

r-- 


m 

CN 

r- 


to 

CN 
r*- 


CN 


CO 
CN 

r*- 


CO 
CN 


o! 

CO 
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Function 


hypothetical protein 


carboxy phosphoenolpyruvate 
mutase 


citrate synthase 




hypothetical protein 




L-malate dehydrogenase 


regulatory protein 




vibriobactin utilization protein 


ABC transporter ATP-binding protein 


ABC transporter 


ABC transporter 


iron-regulated lipoprotein precursor 


chloramphenicol resistance protein 


catabolite repression control protein


hypothetical protein 








Matched 
length 
(aa) 


CO 


co 

CM 


o 

CO 
ro 




CO 

m 




CO 
CO 
CO 


to 

CN 
CM 




CO 
CN 


O) 

to 

CM 


Ol 

CO 
CO 


o 
cn 

CO 


CO 

in 
co 


m 

Ol 
CO 


CO 

o 

CO 


Ol 

CN 








Similarity 


86.4 


76.2 


81.3 




62.3 




tn 
to 


62.8 




CM 

m 


85.1 


86.4 


88.2 


82.3 


9*69 


58.1 


85.8 








Identity
(%)


71.0 


to 


m 




q 

CO 




37.6 


to 

CN 
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tn 


CO 
CD 

m 
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Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv1130 


Streptomyces hygroscopicus 
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Methanothermus fervidus V24S 
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hypothetical protein 


thioredoxin reductase 


PrpD protein for propionate
catabolism


carboxy phosphoenolpyruvate 
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Table 1 (continued) 
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Bacillus subtilis 168 yciC 


Bacillus subtilis IS58 trxB 


Salmonella typhimurium LT2 
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Streptomyces hygroscopicus 
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hypothetical protein 


dTDP-Rha:α-D-GlcNAc-diphosphoryl polyprenol, α-3-L-rhamnosyl transferase


mannose-1-phosphate 
guanylyltransferase 


regulatory protein 


hypothetical protein 


hypothetical protein 


phosphomannomutase
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pheromone-responsive protein 
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two-component system response 
regulator 




two-component system sensor 
histidine kinase 


lipoprotein 


hypothetical protein 




30S ribosomal protein or chloroplast 
precursor 


preprotein translocase SecA subunit 




hypothetical protein 
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Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3246c mtrA 




Mycobacterium tuberculosis 
H37Rv Rv3245c mtrB 


Mycobacterium tuberculosis
H37Rv Rv3244c lpqB


Mycobacterium tuberculosis 
H37Rv Rv3242c 




Spinacia oleracea CV rps22


Brevibacterium flavum 
(Corynebacterium glutamicum)
MJ-233 secA 




Mycobacterium tuberculosis
H37Rv Rv3231c


Mycobacterium tuberculosis 
H37Rv Rv3228 


Corynebacterium glutamicum 
AS019 aroA


Mycobacterium tuberculosis 
H37Rv Rv3226c


Corynebacterium glutamicum 


Mycobacterium tuberculosis 
H37Rv Rv0336 


Mycobacterium tuberculosis
sigH 




db Match 


prf:2214304A 




CO 

o 
n 

CN 
CN 

f 
CL 


CN 

cn 

LO 

o 

r^- 

LL 

i— .' 
CL 


pir:D70592 




sp:RR30_SPIOL 


gsp:R74093 




pir:A70591 


pir:F70590 


gp:AF114233_1 


pir:D70590 


CO 
CO 
CN 

UL 
< 
CL 

O 


pir:G70506 


Q 

CO 
CO 
CO 

m 
in 

CN 
Q. 




n 


GO 

r-- 

CO 


CO 

to 


1497 


1704 


CO 
CO 
LO 


CO 

LO 


CO 

to 
to 


2535 

i 


CN 

to 


O 
LO 


r-- 

CO 

cn 


1413 


o 

CO 

-cr 


CO 
CN 


1110 


CO 
CO 




Terminal 
(nt) 


791409 


790738 


793008 


794711 


795301 


795292 


796110 


798784 


799691 


800200 


800208 


801190 


803128 


802565 


803131 


805025 




Initial 
(nt) 


790732 


791421 


791512 


793008 


794714 


LO 

cn 


CO 
T 
•<T 

in 

cn 


796250 


799020 


799697 


801194 


802602 


802649 


802687 


o 

CN 
"<T 
O 
CO 


804408 




SEQ
NO.
(a.a.)


o 

■ST 

co 


n 


CN 

m i 


co 

CO 


co 


in 

T 
CO 
TT 


to 
■^r 
co 

T 


r- 
■^r 
co 

TT 


4348 


4349 


4350 


LO 
CO 
<T 


4352 


4353 


4354 


4355 




SEQ 
NO. 
(DNA) 


o 

CO 


CO 


CN 
CO 


CO j 

co 


CO 


LO 
GO 


to 

T 

CO 


r-. 
co 


CO 
CO 


cn 

CO 


o 
in 

CO 


m 
co 


CN 

m 

CO 


CO 

in 

CO 


in 

CO 


1 

m 
m 

» 






EP 1 108 790 A2 






Function 


regulatory protein 


hypothetical protein 


hypothetical protein 


DEAD box ATP-dependent RNA 
helicase 
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ATP-dependent DNA helicase 
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Function 


hypothetical protein 


hypothetical protein 






hypothetical protein 


regulatory protein 


ethylene-inducible protein 


hypothetical protein 


hypothetical protein 




alpha-lytic proteinase precursor 




DNA-directed DNA polymerase 


major secreted protein PS1 protein 
precursor 










monophosphatase 
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length 
(aa) 


h- 


o 
m 

CO 






1023 
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CO 
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CO 


o 
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CO 
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T 




CO 

o 

CN 
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CO 
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Similarity 
(%) 


76.4 


74.9 






73.5 


57.7 


89.0 
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73.6 
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CN 

rC 
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co 


o 

CO 


CO 

o 
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26.7 




25.0 


27.0 










JCO. 
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Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3195 


Mycobacterium tuberculosis 
H37Rv Rv3194 






Mycobacterium tuberculosis 
H37Rv Rv3193c 


Deinococcus radiodurans 
DR0840 


Hevea brasiliensis laticifer er1 


Aeropyrum pernix K1 APE0247 


Bacillus subtilis 168 yaaE 




Lysobacter enzymogenes ATCC 
29487 




Neurospora intermedia LaBelle- 
1b mitochondrion plasmid 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC
17965 csp1










Streptomyces alboniger pur3 
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X 
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LU 
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(A 
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CN 

in 
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CO 
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in 
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CO 
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825239 
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834633 
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835837 


838892 


839353 


840139 
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840437 


841517 


Initial 
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Function 


myo-inositol monophosphatase 


peptide chain release factor 2 


cell division ATP-binding protein 


hypothetical protein 


cell division protein 


small protein B (SSRA-binding 
protein) 


hypothetical protein 








vibriobactin utilization protein 


Fe-regulated protein 


hypothetical membrane protein 


ferric anguibactin-binding protein
precursor 


ferrichrome ABC transporter 
(permease) 


ferrichrome ABC transporter 
(permease) 


ferrichrome ABC transporter (ATP- 
binding protein) 
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length 
(aa) 


ro 
CN 


Cn 
in 

CO 


CD 
CN 
CN 


CM 


o 

CO 
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r- 
CN 
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ro 


cn 


in 
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CO 
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CO 


CM 

ro 
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to 
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Similarity 
(%)


59.3 


88.6 


91.2 


54.0 


CO 


75.9 


73.3 
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61.5 
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o 


43.6 


40.5 


in 

CO 


44.(3 








26.fi 
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CM 
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CO 


25 
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CJ 
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c 

o 
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35 


Homologous gene 


Streptomyces flavopersicus 
spcA 


Streptomyces coelicolor A3(2) 
prfB


Mycobacterium tuberculosis 
H37Rv Rv3102c ftsE


Aeropyrum pernix K1 APE206T 


Mycobacterium tuberculosis 
H37Rv Rv3101c ftsX


Escherichia coli K12 smpB 


Escherichia coli K12 yeaO 








Vibrio cholerae OGAWA 395
viuB 


Staphylococcus aureus sirA 


Mycobacterium leprae 
MLCB1243.07 


Vibrio anguillarum 775 fatB 


Bacillus subtilis 168 yclN 


Bacillus subtilis 168 yclO


Bacillus subtilis 168 yclP 
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Function 


hypothetical protein 


phosphoserine transaminase 


acetyl-coenzyme A carboxylase 
carboxy transferase subunit beta 


hypothetical protein 


sodium/proline symporter 




hypothetical protein 


fatty-acid synthase 






homoserine O-acetyltransferase 






glutaredoxin 


dihydrofolate reductase 


thymidylate synthase 


ammonium transporter 


ATP dependent DNA helicase 


formamidopyrimidine-DNA
glycosidase
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length 
(aa) 
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Similari 
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Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0883c 


Bacillus circulans ATCC 21783 


Escherichia coli K12 accD 


Streptomyces coelicolor A3(2) 
SCI8.08c 


Pseudomonas fluorescens 




Mycobacterium tuberculosis 
H37Rv Rv2525c 


Corynebacterium 
ammoniagenes fas 






Leptospira meyeri metX 






Deinococcus radiodurans 
DR2085 


Mycobacterium avium folA 


Escherichia coli K12 thyA 


Escherichia coli K12 cysQ 


Streptomyces coelicolor A3(2)
SC7C7.16c 


Synechococcus elongatus 
naegeli mutM 
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Function 


hypothetical protein 

- 


alkaline phosphatase


integral membrane transporter 




glucose-6-phosphate isomerase


hypothetical protein 




hypothetical protein 


ATP-dependent helicase 


ABC transporter 


ABC transporter 




peptidase 


hypothetical protein 




5'-phosphoribosylglycinamide
formyltransferase 


t 

a> 
o 

n a 
ro t/i 

•JD CO 

E £ 
o «2 
c £ 
U 

cn w 
O t» 
JQ -O 

o I 

Q- X 
(/> o 

0 -Q 

o. « 

1 ^ 

Co TT 


citrate lyase (subunit) 






Matched 
length 
(aa) 


CO 
CN 


CO 

cn 


CO 

o 




r- 
lO 

LO 


to 

CJ) 




CO 

r- 


CO 
CO 

h- 


LO 
CO 
CO 


r-- 

CN 




CO 
CO 
CN 


CO 




en 

GO 


to 
CN 
CO 


CN 






Similarity 
(%) 


r-. 

CD 

co 


71.9 


67.0 




77.0 


52.3 




85.9 


73.1 


48.6 


71.4 




73.3 


60.8 




86.2 


87.8 


100.0 






Identity 

(%)


55.5 


33.8 


CO 
CO 




Tr- 
io 


24.6 




o 

cn 
to 


46.1 


CO 

CN 


CO 
CO 




CO 

CO 

TT 


31.1 




64.6 


LO 

tt 

r- 


100.0




Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0870c 


Lactococcus lactis MG1363 apt


Streptomyces coelicolor A3(2) 
SCI28.06C 




Escherichia coli JM101 pgi


Mycobacterium tuberculosis
H37Rv Rv0336 




Mycobacterium tuberculosis 
H37Rv Rv0948c 


Bacillus stearothermophilus
NCA 1503 pcrA


Streptomyces coelicolor A3(2) 
SCE25.30 


Bacillus subtilis 168 yvrO




Mycobacterium tuberculosis 
H37Rv Rv0950c 


Mycobacterium tuberculosis 
H37Rv Rv0955 




Corynebacterium 
ammoniagenes purN 


Corynebacterium 
ammoniagenes purH 


Corynebacterium glutamicum 
ATCC 13032 citE 
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Function 


repressor of the high-affinity (methyl) 
ammonium uptake system 


hypothetical protein 




30S ribosomal protein S18 


30S ribosomal protein S14 


50S ribosomal protein L33


50S ribosomal protein L28


transporter (sulfate transporter) 


Zn/Co transport repressor 


50S ribosomal protein L31


50S ribosomal protein L32




copper-inducible two-component 
regulator 


two-component system sensor 


proteinase DO precursor 


molybdopterin biosynthesis cnx1
protein (molybdenum cofactor
biosynthesis enzyme cnx1)




large-conductance 
mechanosensitive channel 


hypothetical protein 


5-formyltetrahydrofolate cyclo-ligase
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length 
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Similarity 
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Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 amtR 


Corynebacterium glutamicum 
ATCC 13032 yjcC 




Cyanophora paradoxa rps18 


Escherichia coli K12 rpsN 


Escherichia coli K12 rpmG 


Escherichia coli K12 rpmB


Bacillus subtilis 168 yvdB 


Staphylococcus aureus zntR


Haemophilus ducreyi rpmE 


Streptomyces coelicolor A3(2) 
SCF51A.14 




Pseudomonas syringae copR 


Escherichia coli K12 baeS


Escherichia coli K12 htrA 


Arabidopsis thaliana CV cnx1




Mycobacterium tuberculosis
H37Rv Rv0985c mscL 


Mycobacterium tuberculosis
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hypothetical membrane protein 
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hypothetical protein 


transcription activator of L-rhamnose 
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phosphinothricin resistance protein


hypothetical protein 




hypothetical protein 


lactam utilization protein 


hypothetical membrane protein 






transcriptional regulator 




fumarate hydratase precursor 
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oxidoreductase
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dibenzothiophene desulfurization 
enzyme A 


dibenzothiophene desulfurization 
enzyme C (DBT sulfur dioxygenase) 


dibenzothiophene desulfurization 
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FMNH2-dependent aliphatic 
sulfonate monooxygenase 


glycerol metabolism 


hypothetical protein 


hypothetical protein 




transmembrane efflux protein 


exodeoxyribonuclease small subunit


exodeoxyribonuclease large subunit


penicillin tolerance 
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Function 


cysteine desulphurase 


nicotinate-nucleotide 
pyrophosphorylase 


quinolinate synthetase A 


DNA hydrolase 


hypothetical membrane protein 


hypothetical protein 


hypothetical protein 


lipoate-protein ligase A 


alkylphosphonate uptake protein 
and C-P lyase activity 


transmembrane transport protein or 
4-hydroxybenzoate transporter 


p-hydroxybenzoate hydroxylase (4- 
hydroxybenzoate 3- 
monooxygenase) 


hypothetical membrane protein 


ABC transporter ATP-binding protein 


hypothetical membrane protein 




Ca2+/H+ antiporter ChaA 


hypothetical protein 


hypothetical membrane protein 
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excinuclease ABC subunit A 


thioredoxin peroxidase 






hypothetical membrane protein 


oxidoreductase or thiamin 
biosynthesis protein 
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modifier) 
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Function 


hypothetical protein 


ATPase 


hypothetical protein 


hypothetical protein 
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2-oxoglutarate dehydrogenase 


ABC transporter or multidrug 
resistance protein 2 (P-glycoprotein 
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hypothetical protein 


shikimate dehydrogenase 
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oxidase subunit II 


cytochrome bd-type menaquinol 
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DEAD box ATP-dependent RNA 
helicase 


bacterial regulatory protein, tetR 
family 


pentachlorophenol 4- 
monooxygenase 


maleylacetate reductase 


catechol 1,2-dioxygenase 




hypothetical protein 


transcriptional regulator
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esterase or lipase 
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Klebsiella pneumoniae CG43 
DEAD box ATP-dependent RNA 
helicase deaD 


Mycobacterium leprae
B1308_C2_181 


Sphingomonas flava pcpB 


Pseudomonas sp. B13 clcE


Acinetobacter calcoaceticus 
catA 




Mycobacterium tuberculosis 
H37Rv Rv2972c 


Saccharomyces cerevisiae 
SNF2 




Streptomyces coelicolor A3(2) 
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Mycobacterium tuberculosis 
H37Rv Rv1277


Mycobacterium tuberculosis 
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Petroleum-degrading bacterium 
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short-chain fatty acids transporter 


regulatory protein 






fumarate (and nitrate) reduction 
regulatory protein 


mercuric transport protein periplasmic
component precursor 


zinc-transporting ATPase
Zn(II)-translocating P-type ATPase


GTP pyrophosphokinase (ATP:GTP
3'-pyrophosphotransferase) (ppGpp
synthetase I)


tripeptidyl aminopeptidase 






homoserine dehydrogenase 






nitrate reductase gamma chain 


nitrate reductase delta chain 


nitrate reductase beta chain 
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nitrate reductase alpha chain 


nitrate extrusion protein 
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1307462 


1310369 


1310435 
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1313115 


1314118 


1314470 


1316083 
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1300552 
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1303123 
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Function 




glucose-resistance amylase 
regulator (catabolite control protein) 


ribose transport ATP-binding protein


high affinity ribose transport protein 


periplasmic ribose-binding protein 


high affinity ribose transport protein 


hypothetical protein 


iron-siderophore binding lipoprotein 


Na-dependent bile acid transporter 


RNA-dependent amidotransferase B 


putative F420-dependent NADH 
reductase 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 




dihydroxy-acid dehydratase


hypothetical protein 
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CN 

CO 

r— 


76.9 


r— 


68.4 
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60.2 


61.9 


71.8 


61.1 


66.9 
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CO 


52.6 
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CO 
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ro 

CO 
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Homologous gene 




Bacillus megaterium ccpA 


Escherichia coli K12 rbsA 


Escherichia coli K12 MG1655 
rbsC


Escherichia coli K12 MG1655 
rbsB 


Escherichia coli K12 MG1655
rbsD 


Saccharomyces cerevisiae
YIR042c 


Streptomyces coelicolor 
SCF34.13C 


Rattus norvegicus (Rat) NTCI 


Staphylococcus aureus WHU 29 
ratB 


Methanococcus jannaschii 
MJ1501 f4re 


Escherichia coli K12 yqjG


Mycobacterium tuberculosis 
H37Rv Rv2972c 


Mycobacterium tuberculosis 
H37Rv Rv3005c 




Corynebacterium glutamicum 
ATCC 13032 ilvD


Mycobacterium tuberculosis 
H37Rv Rv3004 
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CN 
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1315325 
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V 

ro 
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1319976 
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1323406 


1324537 


1326256 


1327049 


1329891 
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1333008 


1333188 


1333442 


1335412 
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Initial 
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1315954 


1316338 


1317434 


1319005 


1320001 


1320952 


1321476 


1322393 


1323533 


1324778 


1326378 


1330967 


1331102 


1331953 


1333424 
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actinorhodin polyketide dimerase 


cobalt-zinc-cadmium resistance
protein 












3tein 
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Function 


hypothetical membrane pi 


hypothetical protein 




nitrate transport ATP-bind 


maltose/maltodextrin trans 
binding protein 


nitrate transporter protein 










hypothetical protein 




D-3-phosphoglycerate 
dehydrogenase 


hypothetical serine-rich pr< 






hypothetical protein 
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length 
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Table 1 (continued) 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 yilV 


Sulfolobus solfataricus 




Synechococcus sp. nrtD 


Enterobacter aerogenes 
(Aerobacter aerogenes) malK 


Anabaena sp. strain PCC 7120 
nrtA 






Streptomyces coelicolor 


Ralstonia eutropha czcD 






Methanococcus jannaschii 




Brevibacterium flavum serA 


Schizosaccharomyces pombe 
SPAC11G7.01 






Rhodobacter capsulatus strain 
SB 1003
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1354540 
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Function 






lipoprotein 




glycogen phosphorylase 






hypothetical protein 


hypothetical membrane protein 




guanosine 3',5'-bis(diphosphate)
3-pyrophosphatase


acetate repressor protein 


3-isopropylmalate dehydratase large 
subunit 


3-isopropylmalate dehydratase small 
subunit 




mutator mutT protein
(7,8-dihydro-8-oxoguanine-triphosphatase)
(8-oxo-dGTPase)
(dGTP pyrophosphohydrolase)




NAD(P)H-dependent 
dihydroxyacetone phosphate 
reductase 


D-alanine-D-alanine ligase 
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length 
(a.a) 
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cn 
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CD 
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CO 


to 

CN 


CO 
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*r 
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Similarity 
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52.8 
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60.7 


87.5 
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26.1 


68.1 
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Homologous gene 






Chlamydia trachomatis 




Rattus norvegicus (Rat) 






Bacillus subtilis yrkH 
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Escherichia coli K12 spoT ! 


Escherichia coli K12 iclR


Actinoplanes teichomyceticus 
leu2


Salmonella typhimurium 




Mycobacterium tuberculosis 
H37Rv MLCB637.35C 




Bacillus subtilis gpdA 


Escherichia coli K12 MG1655
ddlA
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1385125 
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Function 




thiamin-phosphate kinase 


uracil-DNA glycosylase precursor 


hypothetical protein 


CU 

LO 

CO 
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ai 
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D 

"c 
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C 

a> 
a. 
cu 

T3 

CL 
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polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


biotin carboxyl carrier protein 


methylase 


lipopolysaccharide core biosynthesis 
protein 




Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


ABC transporter or glutamine ABC 
transporter, ATP-binding protein 


nopaline transport protein 


glutamine-binding protein precursor 




hypothetical membrane protein 




phage integrase 
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length 
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Homologous gene 




Escherichia coli K12 thiL 


Mus musculus ung 


Mycoplasma genitalium (SGC3) 
MG369


Escherichia coli K12 recG 


Neisseria meningitidis 


Propionibacterium freudenreichii
subsp. shermanii


Escherichia coli K12 yhhF 


Escherichia coli K12 MG1655
kdtB 




Neisseria gonorrhoeae


Bacillus stearothermophilus 
glnQ 


Agrobacterium tumefaciens 
nocM 


Escherichia coli K12 MG1655 
glnH




Methanobacterium 
thermoautotrophicum MTH465 




Bacteriophage L54a vinT 
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Function 


hypothetical protein 


30S ribosomal protein S1 




hypothetical protein 










inosine-uridine preferring nucleoside 
hydrolase (purine nucleosidase)


antiseptic resistance protein


ribose kinase 


cryptic asc operon repressor,
transcription regulator




excinuclease ABC subunit B 


hypothetical protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical protein 


hydrolase 
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length 
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Homologous gene 


Streptomyces coelicolor
SCH5.13 yafE


Escherichia coli K12 rpsA 




Brevibacterium lactofermentum 
ATCC 13869 yacE 










Crithidia fasciculata iunH 


Staphylococcus aureus 


Escherichia coli K12 rbsK


Escherichia coli K12 ascG




Streptococcus pneumoniae
plasmid pSB470 uvrB 


Methanococcus jannaschii 
MJ0531 


Escherichia coli K12 ytfH


Escherichia coli K12 ytfG 




Bacillus subtilis yvgS 


Streptomyces coelicolor A3(2)
SC9H11.26c


Escherichia coli K12 ycbL 


db Match 


sp:YAFE_ECOLI 


i 

O 
u 

ID 

1 

CO 

a: 

CL 

to 




LU 

a: 

CO 

i 

LU 
O 
< 

i± 










< 

LL 

cr 

3 

Cl 
to 


sp:QACA_STAAU


sp:RBSK_ECOLI 


sp:ASCG_ECOLI 




a 
cr 
y— 

if) 

*' 

at 
> 

3 

CL 

cn 


< 
— j 
h- 

LU 

1 

CO 

m 
> 

CL 
CO 


sp:YTFH_ECOLI


_i 

O 

o 

LU 

o 1 

U. 
h- 
>- 

CL 
CO 




pir:H70040 


gp:SC9H11_26 


sp:YCBL_ECOLI 




T 

LO 
CO 


1458 


1476 


o 
o 

CO 


1098 


CN 

CO 
LO 


to 

CN 


r-. 

LO 

O) 


CO 
CO 

cn 


cn 

T 


CN 

cn 


1038 


CO 

cn 


2097 




CO 
CO 


CO 
■sr 
co 


CO 

to 


2349 


CN 
OO 


o 
o 
to 


Terminal 
(nt) 


1420071 


1422556 


to 
cn 
o 

OJ 


1425878 


1427354 


1427376 


1427804 


1429246 


1428224 


1429194 


1430659 


1431575 


1433547


1436201 


1436775 


1436869 


1438201 


1440026 


1438212 


LO 

r- 
to 
o 

T 


1441793 


Initial 
(nt) 


1420724 


1421099


1422571


1425279 


1426257 


1427957


1428049 


1428290 


1429159 


CN 

CO 
O 
CO 


1431579 


1432612 


1432750


1434105 


1436335 


cn 

CN 

r-- 
co 


1437356 


1439343 


1440560 


1441586 


1442392 


SEQ 
NO. 
(a.a.) 


4999 


5000 


5001


5002 


5003 


5004 


5005 


5006 


5007 


5008 


5009


5010 


5011


5012 


5013 


5014 


5015 


5016 


5017 


5018 


5019


SEQ 
NO. 
(DNA) 


cn 
cr> 


o 
o 

LO 


1 


CN 

o 

LO 


CO 

o 
to 


1504 


1505 


9091 


r- 
o 
to 


CO 
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m 


1509 


o 


1511


1512 


1513 


t 

LO 


1515 


1516


tn 


CO 
r- 

m 


cn 
in 
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Function 


excinuclease ABC subunit A 


hypothetical protein 1246 (uvrA 
region) 


hypothetical protein 1246 (uvrA 
region) 






translation initiation factor IF-3 


50S ribosomal protein L35 


50S ribosomal protein L20






sn-glycerol-3-phosphate transport 
system permease protein 


sn-glycerol-3-phosphate transport 
system protein 


sn-glycerol-3-phosphate transport 
system permease protein


sn-glycerol-3-phosphate transport 
ATP-binding protein 


hypothetical protein 


glycerophosphoryl diester 
phosphodiesterase 


Jk 

O 

H> to 
c JO 

S 1 

< jc 
Z "55 
£ E 


phenylalanyl-tRNA synthetase alpha
chain 


Matched 
length 
(a.a) 


CN 
LO 

cn 


O 
O 


CN 
XT 
v- 






cn 
r-- 


O 
to 


r- 






CN 
cn 
CN 


o 

CN 


CO 
CO 
XT 


CO 

cn 

CO 


XT 


XT 
xT 
CN 


CO 

in 










































(%) 


80.6 


57.0 


47.0 






78.2 


76.7 


92.7 






71.6 


70.4 


57.6 


71.3 


56.0 


50.0 


71.2 






56.2 


40.0 


31.0






in 

CN 

in 


41.7 


o 

LO 






33.2 


33.3


-to 

CO 
CN 


xT 
XT 


47.6 


ON 
CD 
CN 


XT 

CO 




Homologous gene 


Escherichia coli K12 uvrA 


Micrococcus luteus 


Micrococcus luteus 






Rhodobacter sphaeroides infC 


Mycoplasma fermentans


Pseudomonas syringae pv. 
syringae 






Escherichia coli K12 MG1655
ugpA


Escherichia coli K12 MG1655 
ugpE


Escherichia coli K12 MG1655
ugpB 


Escherichia coli K12 MG1655


Aeropyrum pernix K1 APE0042


Bacillus subtilis glpQ


Escherichia coli K12 MG1655
trmH 



Bacillus subtilis 168 syfA 


db Match 


sp:UVRA_ECOLI


to 

o 

XT 
O 

O 
—i 

or 

o_ 


PIR:JQ0406 






sp:IF3_RHOSH 


UJ 

u, 

o 

LO 1 
CO 

_l 
or 

ci. 
m 


sp:RL20_PSESY 






_j 
O 

o 

UJ 

bl 

V) 


sp:UGPE_ECOLI 


_j 
o 
o 

UJ 

I 

CD 
CL 

CD 
Z> 

Ql 
co 


sp:UGPC_ECOLI 


to 
in 
r-» 

CN 
UJ 

or 

CL 


sp:GLPQ_BACSU 


sp:TRMH_ECOLI 


sp:SYFA_BACSU 


ORF 

(bp)


2847 


to 
o 

CO 


o 
in 

XT 




2124 


to 

LO 


CN 

cn 


CO 

n 


CN 
CN 
CO 


r- 
CO 

m 


CO 

o 
cn 


XT 

CO 
CO 


1314 


1224 


cn 

XT 
CN 


^ 


xT 

cn 
m 


1020 


Terminal 
(nt) 


1445333


1443810 


xT 
XT 

o> 

xr 
XT 
XT 


1446874 


1445323 


1448358


1448581 


1449025 


1449119 


1450692 


1451820 


1452653 


1454071 


1455338 


1454102 


1455350 


1456948 


1458066 


Initial 
(nt) 


r- 
CO 
"*T 
CN 
XT 
XT 


1444115 


1445393


1446158 


1447446 


CN 

cn 
r- 

XT 
XT 


1448390


m 

xT 

to 

CO 

XT 
XT 


1449940 


1450126 


CO 

o 
o 
in 

XT 


1451820 


1452758 


LO 

XT 
LO 

XT j 


1454350 


1456066 


1456355 


1457047 


SEQ 
NO. 
(a.a.) 


5020 


5021 


5022 


5023 


5024


5025 


5026 


5027 


5028 


5029 


5030 


5031 


5032 


5033 


5034 


5035 


5036 


5037 


SEQ 
NO. 
(DNA) 


1520 


1521 


1522 


1523 


1524 


1525


1526


1527 


1528 


1529 


1530 


1531 


1532 


1533 


1534 


1535 


1536 


^ I 

5 | 
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5 
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Function 


phenylalanyl-tRNA synthetase beta 
chain 




esterase 


macrolide 3-O-acyltransferase 




N-acetylglutamate-5-semialdehyde 
dehydrogenase 


glutamate N-acetyltransferase 


acetylornithine aminotransferase 


argininosuccinate synthetase 




argininosuccinate lyase 








hypothetical protein 


tyrosyl-tRNA synthase
(tyrosine-tRNA ligase)


hypothetical protein 




hypothetical protein 






Matched 
length 
(a.a) 


CO 
CO 




CO 
CO 
CO 


CO 
CN 
T 




co 


CO 
CO 
CO 


cn 

CO 


O 




CO 

r- 
■c 








OS 




cn 




CN 
<CT 














CO 






r- 


CN 


tn 




o 








o 


CO 






O 
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tn 


CD 
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cn 


cn 
cn 


cn 
cn 


cn 
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o 
cn 








CN 
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cn 
r- 


to 




tri 
r- 






>* 

.-tr 


to 




tn 


o 




CO 


uo 


o 


tn 




CO 












cr> 










CD 


-ci - 




-CD- 

CN 


o 

CO 




CO 

cn 


cn 
cn 


o> 
cn 


cn 




CO 

co 








CO 


CO 

rr 


to 

CN 




r-- 
















































Table 1 (continued) 


Homologous gene 


Escherichia coli K12 MG1655
syfB 




Streptomyces scabies estA 


Streptomyces mycarofaciens 
mdmB 




Corynebacterium glutamicum
AS019 argC


Corynebacterium glutamicum
ATCC 13032 argJ


Corynebacterium glutamicum
ATCC 13032 argD 


Corynebacterium glutamicum
AS019 argG




Corynebacterium glutamicum
AS019 argH








Escherichia coli K12 ycaR 


Bacillus subtilis syy1


Methanococcus jannaschii 
MJ0531 




Chlamydia muridarum Nigg 
TC0129 






db Match 


_j 
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o 

UJ 

LL 
>- 
CO 




o 

CO 

cn 
f— 

CO 

<' 

co 

UJ 


>- 
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i- 

00 
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i 

CN 
CN 
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o 
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< 
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ARGD_CORGL 
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o 
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CO 
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CO 

r- 

CO 

o 
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_j 
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o 

LU 

I 
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< 
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i- 
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CO 
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r- 
co 
r- 

CO 
LL 

DC 
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CL 

tn 




CL 
CO 


CL 
tO 




CL 

cn 


CL 

tn 


Cl 
to 


CL 

tn 




CL 
CO 








to 


CL 

tn 


Cl 

CO 








81 


2484 




CN 

h- 
O) 


1383 


CN 
O 


1041 


1164 


1173 


1203 


1209 


1431 


1143 


1575 


CN 
CO 


r*- 


1260 


in 
to 


o 
cn 

CO 


i - 

i 






Terminal 
(nt) 


1460616 


1458196 


1462128 


1463516 


1463934 


1465123 


1466373 


CO 

LO 
CO 

to 
"J- 


1471413 


1470154 


1472907 


1474119 


1475693 


1476294 


1476519 


1477809 


1477929 


1478503 


1483335 






Initial 
(nt) 


1458133 


1458966 


1461157 


1462134 


1463533 


1464083 


o 

CN 

tn 
to 


1467376 


1470211


1471362 


r- 
r- 


1472977 


1474119


1475683 


1476343 


1476550 


1478393 


1478892 


1483475 






SEQ 

NO. 
(a.a.) 


5038 


5039 


5040 


5041


5042 


5043 


5044


5045 


5046 


5047 


5048 


5049


5050 


5051 


5052 


5053 


5054 


5055 


5056 






SEQ 
NO 
(DNA) 


I CO 
CO 
I £ 


cn 

CO 

in 


o 

m 


I ^ 


CN 

m 


co 
m 


1544 


in 
m 


1546


m 


CO 

m 


cn 
m 


o 
m 
m 


tn 
m 


!cn 

: in 
• in 


1553 


m 
tn 


m 
in 

in 


1556 
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Function 


hypothetical protein 


translation initiation factor IF-2 


hypothetical protein 




hypothetical protein 


hypothetical protein 


DNA repair protein


hypothetical protein 


hypothetical protein 


CTP synthase (UTP-ammonia 
ligase) 


hypothetical protein 


tyrosine recombinase 


tylosin resistance ATP-binding
protein 


chromosome partitioning protein or 
ATPase involved in active 
partitioning of diverse bacterial 
plasmids 


hypothetical protein 




thiosulfate sulfurtransferase 


hypothetical protein 


ribosomal large subunit 
pseudouridine synthase B 


Matched 
length 
(a.a) 


T 

co 


CN 
GO 


CO 




o 

CO 
CN 


LO 
CN 
CN 


r- 

LO 


T 

cn 

CO 


CO 
CO 


CD 

LO 


LO 


o 

o 

CO 


LO 
LO 


CO 
LO 
CN 


LO 
CN 




o 

CN 


CN 


cn 

CN 
CN 


Similarity 


66.0 


o 
f- 

CO 


60.1 




69.6 


31.6 


63.4 


73.1 


68.1 


76.7 


71.3 


71.7 


59.7 


73.6 


64.5 




O 
f-- 

to 


65.7 


72.5


Identity 
(%)


61.0 


CO 
CO 


29.6 




38.5 


31.6 


31.4 


cn 


30.4 


o 

LO 


36.3 


39.7 


30.5 


to 

T 


28.3




to 

LO 
CO 


CO 
CO 


LO 


Homologous gene 


Chlamydia pneumoniae 


Borrelia burgdorferi IF2 


Bacillus subtilis yzgD 




Bacillus subtilis yqxC 


Mycobacterium tuberculosis 
H37Rv Rv1695 


Escherichia coli K12 recN 


Mycobacterium tuberculosis 
H37Rv Rv1697 


Mycobacterium tuberculosis 
H37Rv Rv1698


Escherichia coli K12 pyrG 


Bacillus subtilis yqkG 


Staphylococcus aureus xerD 


Streptomyces fradiae tlrC 


Caulobacter crescentus parA 


Bacillus subtilis ypuG 




Datisca glomerata tst 


Bacillus subtilis ypuH 


Bacillus subtilis rluB 


db Match 


GSP:Y35814 


sp:IF2_BORBU 


sp:YZGD_BACSU 




sp:YQXC_BACSU 


LU 
< 
X 

I 

CO 
~i 

LL 
> 
CL 
CO 


sp:RECN_ECOLI 


CN 
O 

LO 

o 
f-. 
X 

'5. 


pir:A70503 


_i 
O 
a 

LU 

I 

o 

> 

a 

CL 
CO 


3 
CO 

o 

s' 

O 
>■ 

CL 
CO 


gp:AF093548_1 


sp:TLRC_STRFR 


T 

O 
CO 
I s -- 
CO 
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O 
Cl 

CJ) 


sp:YPUG_BACSU 




to' 
to 

o 

LL 
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cn 


3 
CO 

o 
< 

CO 

t 

X 
3 

o_ 
>- 

CL 

to 


sp:RLUB_BACSU 


ORF
(bp)


CO 
CN 


1353 


T 

co 
cn 


CN 
CO 


cn 

CO 


CO 

r»- 

CO 


1779 


1191 


CO 

to 
cn 


1662 


r- 
ir> 
to 


CN 
CD 


1530 


CO 
CO 

r- 


LO 

to 


CO 
LO 


to 

CO 


CO 
LO 


CO 
LO 

r- 


Terminal 
(nt) 


CN 
f- 

co 

oo 


1486027 


1487025 


1487193 


1488056 


1489018 


1490881 


1492134 


1493109 


1495174 


1495861 


1496772 


1496795 


1499645 


1500695 


1500911 


1502576 


1503176 


1504238 


Initial 
(nt) 


1483996 


1484675 


1486042 


1487032 


1487238 


to 

CO 
CO 


1489103 


1490944 


1492147 


1493513 


1495205


1495861


1498324


1498863


1499931


1501471 


1501710 


1502634 


1503483 


SEQ
NO.
(a.a.)


5058 


5059 


5060


5061 


5062 


5063 


5064 
5065 


5066 


5067


5068


5069


5070


5071


5072


5073


5074


5075 


SEQ 
NO. 
(DNA) 

1557 


CO 
LO 
LO 


cn 

LT) 

wo 


o 

CO 
LO 


1561


CN 

to 

LO 


CO 

to 

LO 


1564 
1565 


to 
to 

LO 


r>- 
ID 
LO 


CO 
CD 
lO 


cn 
to 

LO 


o 

U"> 


LO 


CN 
I s - 
LO 


CO 

r-. 

LO 


1574 
1575 



Function 


cytidylate kinase


GTP binding protein 






methyltransferase 


ABC transporter 


ABC transporter 




hypothetical membrane protein 




Na+/H+ antiporter 






hypothetical protein 


2-hydroxy-6-oxohepta-2,4-dienoate 
hydrolase 


preprotein translocase SecA subunit 


signal transduction protein 


hypothetical protein 


hypothetical protein 


Matched 
length 
(a.a) 


o 

CM 
CM 


tn 

CO 






CN 
CO 
CN 


cn 
cn 


CN 

o 
to 




r-- 
m 

CN 










o 

CO 


o 

CM 


m 
o 

co 


CM 
CO 


CO 
CM 


CO 
CO 


Similarity 
(%) 


73.6 


74.0 






67.2 


60.1 


56.3 




73.2 




61.5 






57.7 

I __ 


63.8 


61.7


93.2 


74.4 


63.2 


>* 

Q) 
■D 


38.6 


CO 






36.2 


29.7 


CN 

CO 




cn" 

CO 




25.7 






36.9


25.2


35.2


75.3




30.8 


Homologous gene 


Bacillus subtilis cmk 


Bacillus subtilis yphC 






Mycobacterium tuberculosis
Rv3342 


Corynebacterium striatum M82B 
tetA 


Corynebacterium striatum M82B 
tetB 




Escherichia coli K12 ygiE 




Bacillus subtilis ATCC 9372
nhaG






Escherichia coli K12 o249#9
ychJ


Archaeoglobus fulgidus AF0675 


Bacillus subtilis secA 


Mycobacterium smegmatis garA 


Mycobacterium tuberculosis
H37Rv Rv1828


Mycobacterium tuberculosis 
H37Rv Rv1828 


db Match 


sp:KCY_BACSU


sp:YPHC_BACSU 






3 
i- 
O 

> 

I 

CN 

X 
> 

cL 
cn 


prf:2513302B 


prf:2513302A 




sp:YGIE_ECOLI 




in 
m 
in 
cn 

CM 

o 

CD 
< 

a. 

CD 






sp:YCHJ_ECOLI 


pir:C69334 


sp:SECA_BACSU 


CM 

1 

00 
CO 

< 

CL 

cn 


sp:YODF_MYCTU


3 
h- 
O 
>- 

1 

LU 
Q 

o 

>■_ 

CL 

tn 


ORF
(bp)


o 

CD 
CO 


1557 


to 
to 
to 


CO 
CD 


CO 
CO 


1554 


1767 


m 

CN 
CO 


cn 
co 


CD 
GO 

T— 


1548 


CO 
CO 


o 

CN 


in 
r*- 

CO 


1164 


2289 


CD 
CN 


to 
to 
h- 


CO 
CO 

to 


Terminal 
(nt)


in 
cn 
O 

un 


1506573 


1506662 


1507405 


1507917 


1510366 


1512132 


1510843 


1512977 


1514693 


1512980 


1514974


1515815 


1515408 


1515799 


1519458 


1520029 


1520945 


1521589 


Initial 
(nt) 


1504256


1505017


1507327


1507902


1508729


1508813 


1510366 


1511667 


1512189 


1514505 


1514527 


1515159 


1515396 


1515782


1516962 


1517170 


1519601 


1520190 


1520957 


SEQ 
NO. 
(a.a.) 


5076 


5077 


5078 


5079


5080 


5081 


5082 


5083 


5084


5085 


5086 


5087 


5088 


5089 


5090 


5091 


5092 


5093 


5094 


SEQ 
NO. 
(DNA) 


to 

r-. 
m 


r- 
in 


CO 

to 


1579


o 

CO 

m 


CO 

in 


1582 


fO 
CO 

m 


CO 

m 


m 

CO 

tn 


1586 


00 

m 


CO 
CO 

in 


cn 

CO 

in 


1590 
1591 


CN 

cn 
in 


CO 

cn 
in 


CD 

in 



Function 


hypothetical protein 










hemolysin 


hemolysin 




DEAD box RNA helicase


ABC transporter ATP-binding protein 


6-phosphogluconate dehydrogenase 


thioesterase 




nodulation ATP-binding protein I 


hypothetical membrane protein 


transcriptional regulator 


phosphonates transport system 
permease protein 


phosphonates transport system 
permease protein 


phosphonates transport ATP-binding 
protein 






Matched 
length 
(aa) 


CO 

r- 










CN 
CO 


LO 
(O 




CO 


m 

CN 


CN 
CO 

<<r 


CN 




m 

CO 
CN 


CN 
CO 
CM 


CM 


CO 
CN 


CO 
CO 
CN 


o 

LO 
CN 


















































Similarity
(%) 


84.3 










69.0 


65.5 




69.5 


66.1 


99.2 


67.8 




68.1 


76.3 


63.9 


63.4 


62.3 


72.0 






Identity 
(%)


71.4 










CD 

CO" 
CO 


CO 




CN 
T 


34.3


0[66 


r- 
CO 
CO 




CD 

CO 
CO 


CO 


to 

CN 


29.9 


r-' 

CN 








Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv1828 










Bacillus subtilis yhdP 


Bacillus subtilis yhdT 




Thermus thermophilus herA 


Mycobacterium tuberculosis 
H37Rv Rv1348


Brevibacterium flavum 


Mycobacterium tuberculosis
H37Rv Rv1847




Rhizobium sp. N33 nodI


Mycobacterium tuberculosis 
H37Rv Rv1686c


Escherichia coli K12 yfhH


Escherichia coli K12 phnE 


Escherichia coli K12 phnE 


Escherichia coli K12 phnC 






db Match 


sp:YODE_MYCTU 










ZD 

go 
u 
< 

CD 

a' 

o 

X 
> 

CL 

to 


sp:YHDT_BACSU 




gp:TTHERAGEN_1


sp:YD48_MYCTU 


CO 

to 

r>- 
CN 

cL 

<A 

CO 


t 

to 

CO 

o 
O 
a. 




sp:NODI_RHIS3 


pir:E70501 


sp:YFHH_ECOLI 


_t 
O 
o 

LU 

I 

LU 

X 
Q_ 

CL 

m 


sp:PHNE_ECOLI 


sp:PHNC_ECOLI 








co 
r-- 
tn 


o 
in 


1449 


o 
o 
to 


o 

CO 
CO 


1062 


1380 


CO 
CN 


■*T 
"<T 
CO 


in 

CO 

r-~ 


to 


CN 

to 


LO 

to 




t 

r- 


CO 
CO 


to 

CO 


rr 

o 

CO 


o 

CO 


o 

1 — 
CN 


1050 


Terminal 
(nt) 


1522343 


CM 
CO 
T 
CM 
CM 

tn 


1523052 


1525973 


1524568 


1525473 


1526534 


1528186 


1527987 


1530220 


1530341 


1532394 


1532996 


1533781 


1534521 


1534529 


1535382 


1536227 


1537030 


1538968 


1537870 


Initial 
(nt) 


1521771 


1522941 


1524500 


1525374 


1525497 


1526534


1527913


1527968


1529330


1529486


1531816


1531933


1532322


1533041


1533781 


1535401 


1536227 


1537030 


1537833 


1538759 


1538919 




5095 


5096 


5097 


5098 


5099 


5100 


5101 


5102 


5103 


5104 


5105 


5106 


5107 


5108 


5109 


5110 


5111


5112 


5113 


5114 


5115 


SEQ 
NO. 


1595 


1596 


1597 


1598 


1599 


1600


1601 


1602 


1603 


1604 


1605


1606 


1607 


1608 


1609 


1610


1611 


1612 


1613 


1614 


1615 



Function




phosphomethylpyrimidine kinase


hydroxyethylthiazole kinase


cyclopropane-fatty-acyl-phospholipid 
synthase 


sugar transporter or 4-methyl-o- 
phthalate/phthalate permease 


purine phosphoribosyltransferase


hypothetical protein 


arsenic oxyanion-translocation pump 
membrane subunit 




hypothetical protein 


sulfate permease 


hypothetical protein 










hypothetical protein 


dolichol phosphate mannose 
synthase 


apolipoprotein N-acyltransferase 




secretory lipase 




Matched 
length 
(a.a) 




CN 

to 

CN 


CD 
CN 


m 


CO 
CO 


to 
m 


CO 

o 

CN 


CO 
CO 




CN 
CN 
CN 


o> 
to 


r- 
co 










o 


CN 


r-- 

CN 

m 




CN 

cn 

CO 




Similarity 
(%) 




70.2 


77.5 


55.0 


66.9 


o 

O) 

in 


68.5 


54.6 




83.8


83.6 


50.0










87.3 


71.0 


55.6 




55.6 




Identity 




JO 

r--' 


SP 
CO 


_<o 

CO 
CN 


in 

CN 
CO 


JO 

CO 
CO 


CO 

oS 

CO 


CO 
CO 
CN 




62.2 


_co_ 
to 


39.0 










71.3 


39.2 


tri 

CN 




23.7 




Homologous gene 




Salmonella typhimurium thiD


Salmonella typhimurium LT2 
thiM 


Mycobacterium tuberculosis 
H37Rv ufaA1


Burkholderia cepacia Pc701
mopB


Thermus flavus AT-62 gpt


Escherichia coli K12 yebN 


Sinorhizobium sp. As4 arsB




Streptomyces coelicolor A3(2) 
SCI7.33 


Pseudomonas sp. R9 ORFA 


Pseudomonas sp. R9 ORFG 










Mycobacterium tuberculosis
H37Rv Rv2050 


Schizosaccharomyces pombe
dpm1


Escherichia coli K12 lnt




Candida albicans lip1




db Match 




sp:THID_SALTY 


> 
< 

CO 

i 

X 
t— 

Cl 

CO 


pir:H70830 


prf:2223339B 


prf:2120352B 


sp:YEBN_ECOLI 


CN 

i 

CO 

tn 

co 
r- 

LL. 

< 
CL 
Ol 




gp:SCI7_33 


gp:PSTRTETC1_6 


GP:PSTRTETC1_7 










pir:A70945 


prf:2317468A 


sp:LNT_ECOLI 




gp:AF188894_1 




ORF
(bp)


CN 
O 

r- 


1584 


t 

O 

CO 


1314 


1386 


■*r 


o 
cn 
to 


CO 

cn 

CD 


co 

CO 


CO 
CO 

to 


1455 


CO 
CN 
T 


m 
to 


r- 
o 

CN 


cn 

CO 


o 
in 
r*- 


to 

CO 
CO 


o 

CO 


1635 




1224 




Terminal 
(nt) 


1538963 


1539820 


1542119 


1546289


1546307


1547967


1549349


1550398


1550951


1552237


1553972


1553297


1554070


1555067


1554891 


1555086 


1556771 


1557014 


1557859 


1559497 


1560437 




Initial
(nt)


1539664 


CO 

o 
m 


1542922 


to 
cn 

m 


1547692 


1548440 


1548651 


1549403 


1550469


1551545


1552518


1553722


1554684


1554861 


1555079 


1555835 


1556376 


1557823 


1559493 


1560237 


1561660






5116 


5117 


5118 


5119 


5120 


5121 


5122 


5123 


5124 


5125 


5126 


5127 


5128 


5129 


5130


5131 


5132 


5133 


5134


5135 


5136 




SEQ
NO.
(DNA)


CO 
CO 


to 


1618 


cn 
5 


1620 


CN 

to 


CN 
CN 
CO 


I CO 
CN 
> CO 

L _ ... .. 


CN 
CO 


m 

CN 
CO 


to 

CN 
CO 


t^. 

CN 

to 


CO 
CN 
CO 


o> 

CN 
CO 


1630 


CO 
CO 


1632


1633 


ro 
CO 


m 

CO 
CO 


9191 
Function


AAA family ATPase (chaperone-like
function) 


protein-beta-aspartate
methyltransferase 


aspartyl aminopeptidase 


hypothetical protein 


virulence-associated protein 


quinolone resistance protein


aspartate ammonia-lyase 


ATP phosphoribosyltransferase 


beta-phosphoglucomutase 


5-methyltetrahydrofolate- 
homocysteine methyltransferase 




alkyl hydroperoxide reductase 
subunit F 


arsenical-resistance protein 


arsenate reductase 


arsenate reductase 




cysteinyl-tRNA synthetase






Matched 
length 
(aa) 


LO 
ID 


CO 
CN 


CD 
CO 


CO 
CD 
CN 


O) 
CD 


m 
co 
ro 


to 

CN 

m 


V 

00 
CN 


in 

Ol 


T 

m 

CM 




CO 
CO 
CO 


CO 

OO 
CO 


CO 
CNi 


CO 
CN 




CO 
CO 














































Similarity
(%) 


78.5 


79.0 


CN 
CD 


71.4 


72.5 


61.0 


99.8 


97.5 


CO 

to 


62.4 




49.5 


Ol 
CO 
CD 


64.3 


75.6 




64.3 






Identity 


51.6 


57.3 


_T~ _ 

CO 
CO 


in 


40.5 


21.3 


■ l 
99.3 


to 

Ol 


30.3 


31.(3 




CN 
CM 


33.0 


CM 
CO 


r- 




in 

CO 




Table 1 (continued) 


Homologous gene 


Rhodococcus erythropolis arc


Mycobacterium leprae pimT 


Homo sapiens 


Mycobacterium tuberculosis 
H37Rv Rv2119


Dichelobacter nodosus A198 
vapl 


Staphylococcus aureus norA23 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233
aspA


Corynebacterium glutamicum 
AS019 hisG


Thermotoga maritima MSB8 
TM1254 


Escherichia coli K12 metH 




Xanthomonas campestris ahpF 


Saccharomyces cerevisiae 
S288C YPR201W acr3


Staphylococcus aureus plasmid 
pI258 arsC


Mycobacterium tuberculosis 
H37Rv arsC 




Escherichia coli K12 cysS






db Match 


a 

CN 
CO 
CO 
CN 
CN 

■^r 

CN 

iz 
a. 


co 

CN 

r- 

CO 

1 "5. 


o l 
m 
o 
in 
o 
o 

UL 
< 
CL 
Ul 


pir:B70513 


O 

O 
< 

m 

i 

< 
> 

Cl 
to 


prf:2513299A 


sp:ASPA_CORGL 


gp:AF050166_1 


pir:H72277 


_j 
O 
u 

UJ 

1 

X 

h- 

LU 
Cl 

CO 




sp:AHPF_XANCH 


i ■ ■ 

sp:ACR3_YEAST 


sp:ARSC_STAAU 


pir:G70964 




sp:SYC_ECOLI






ORF 
(bp) 


1581 


■<T 
CO 
CO 


1323 


*r 

CO 
CO 


*T 
CD 
CN 


1209 


1578 


CO 
TT 
00 


CO 
Ol 

to 


3663 


o 
r-» 
m 


1026 


1176 


o 

CN 
T 


Ol 
CO 
CD 


CO 

r- 
co 


1212 






Terminal 
(nt) 


1576951 


1578567 


1579449 


1581640 


1582114 


1582273 


1583913 


1585603 


1586812 


1587573 


1591912 


1591941 


1594512 


1594951


1595668 


1595844 


1596249






Initial 
(nt) 


1578531 


1579400 


1580771 


1580807 


1581851 


1583481


1585490


1586445


1587504


1591235 


1591343 


1592966 


1593337 


1594532 


1595030


1596221


1597460






SEQ 
NO. 
(a.a.)


m 

LO 


5155 


5156


5157 


5158 


5159 


5160 


5161 


5162


5163 


5164 
5165 


5166


5167 


5168 


5169 


5170 






SEQ 
NO. 
(DNA) 


'ST 

m 

CD 


in 
m 

CD 


CD 

in 

CD 


r- 
m 

CD 


CO 

m 

CD 


cn 
in 
to 


1660 


to 
to 


1662


CO 
CO 
CO 


V 

to 

(O 

I 


m 
to 
to 


1666 


r» 

to 
, to 


CO 
CD 

to 


Ol 
CO 

to 


o 
to 
Function


bacitracin resistance protein 


oxidoreductase 


lipoprotein 


dihydroorotate dehydrogenase 






transposase 




bio operon ORF I (biotin biosynthetic 
enzyme) 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 




ABC transporter 




ABC transporter 




puromycin N-acetyltransferase 


LAO (lysine, arginine, and
ornithine)/AO (arginine and
ornithine) transport system kinase


methylmalonyl-CoA mutase alpha 
subunit 


Matched 
length 
(a.a) 


m 
m 

CN 


to 

CN 
CO 


o 

CO 


T 
CO 
CO 






o 

CO 
CO 




CN 
LO 


co 

CO 




r- 
co 
in 




m 

CO 

m 




CO 

in 


CO 
CO 
CO 


TT 

h- 


Similarity 
(%) 


69.4 


62.6 


53.5 


67.1 






55.3 




75.0 


33.0 




68.7 




67.1 




56.4 


72.3 


87.5 


Identity
(%)


37.3 


33.4 


27.0 


o 

TT 






r- 

tT 
CO 




44.1


26.0




43.6 




36.9 




oi 

co 


CO 

T 


Z'ZL 


Homologous gene 


Escherichia coli K12 bacA 


Agrobacterium tumefaciens
mocA 


Mycobacterium tuberculosis 
H37Rv lppL


Agrocybe aegerita ura1






Pseudomonas syringae tnpA 




Escherichia coli K12 ybhB


Neisseria meningitidis 




Corynebacterium striatum M82B 
tetB 




Corynebacterium striatum M82B 
tetA 




Streptomyces anulatus pac 


Escherichia coli K12 argK 


Streptomyces cinnamonensis 
A3823.5 mutB 


db Match


sp:BACA_ECOLI 


prf:2214302F 


pir:F70577 


LU 
< 

cr: 
O 

<l 
a 
or 
> 

CL 
cL 

V) 






gp:PSESTBCBAD_1




—i 
o 
o 

LU 

1 

CD 

X 
DO 
> 

CL 
CO 


GSP:Y74829




prf:2513302A 




prf:2513302B 




pir:JU0052 


sp:ARGK_ECOLI 


sp:MUTB_STRCM


ORF
(bp)


co 

r- 
CO 


CO 
O) 


CO 
CO 

CO 


1113 


m 

CO 


o 

CO 


out 


CD 

co 

TT 


CO 

lo 


CO 
CN 

r-- 


co 
o 

CO 


1797 


CO 

TT 

CN 


1567 


m 

CO 


CO 

o 

CO 


1089 


2211 


Terminal 
(nt) 


1597745 


1599614 


1600677 


1601804 


1601931


1603466


1604629 


1604830 


1605281 


1606689 


1608248 


1605861 


1609335 


1607661 


1609842 


1610844 


1611150 


1612234 


Initial 
(nt) 


1598623 


1598667 


1599679 


1600692 


1602281


1602660 


1603520 


1605315


1605811 


1605961 


1607646 


1607657 


1609087 


1609247 


1610192 


1610236 


1612238 


1614444 


SEQ 

NO. 
(a.a.) 


5171 


5172 


5173 


5174 


5175 


5176 


5177 


5178


5179 


5180 


5181 


5182 


5183 


5184 


5185 


5186 
5187 


5188 


SEQ 
NO. 
(DNA)


1671 


1672 


1673 


1674 


1675 


1676 


1677
1678


1679 


1680 


1681 


1682 


1683 
1684 


1685 


1686 
1687 


1688 
Function


methylmalonyl-CoA mutase beta 
subunit 


hypothetical membrane protein 




hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 




ferrochelatase 


invasin 




aconitate hydratase 


transcriptional regulator 


GMP synthetase 


hypothetical protein 


hypothetical protein 




hypothetical protein 






Matched 
length 


o 

CD 


CN 
CN 




o 

CO 




CD 
CN 




CO 
CO 


to 




CD 

in 

CD 




tn 

CO 
CM 


CM 
CN 


CO 
CO 




CO 






Similarity 


68.2 


70.1




87.0 


78.7


72.8 




65.7 


56.5 




85.9 


81.6 


51.9 


62.0


80.2 




86.1 






Identity 

(%)


41.6 


r- 
co 




64.1 


TT 


51.0 




36.8 


25.5 




69.9 


54.6


21.3


_cp 

CN 
CO 


CM 

CO 




61.2 




Table 1 (continued) 


Homologous gene 


Streptomyces cinnamonensis
A3823.5 mutA 


Mycobacterium tuberculosis 
H37Rv Rv1491c




Mycobacterium tuberculosis 
H37Rv Rv1488


Mycobacterium tuberculosis 
H37Rv Rv1487


Streptomyces coelicolor A3(2) 
SCC77.24 




Propionibacterium freudenreichii 
subsp. shermanii hemH


Streptococcus faecium 




Mycobacterium tuberculosis 
H37Rv acn 


Mycobacterium tuberculosis 
H37Rv Rv1474c


Methanococcus jannaschii
MJ1575 guaA 


Streptomyces coelicolor A3(2)
SCD82.04C 


Methanococcus jannaschii 
MJ1558 




Neisseria meningitidis MC58
NMB1652 






db Match 


o 
q: 
t- 

CO 

<• 

r- 
CL 


sp:YS13_MYCTU 




sp:YS09_MYCTU 


r- 
O 
r^- 

m 

"cL 


gp:SCC77_24 




a: 
u_ 
O 

a 

i 

M 

5 

LU 
X 

CL 
<S) 


o 

Ll_ 
h- 

-z. 

LJ 

I 

m 

CL 

d 
cn 




pir:F70873 


pir:E70873 


CO 

cn 
■^r 

T 

CD 
Ll. 

5. 


i 

CM 
CO 

Q 
O 
CO 
cL 
cn 


o> 

TT 

"er 
to 
LU 

l: 
Ol 




gp:AE002515_9 






ORF
(bp)


1848 


CO 
CM 


r- 
co 
m 


1296 


in 

CO 


CO 
CO 


CO 
CO 

r- 


1110 


1800 


CO 
CD 

-or 


2829 


•<t 
to 
m 


to 
tn 
r- 


CO 

to 
to 


r^- 
to 

CN 


ro 

CD 
CO 


1392 






Terminal 
(nt) 

1614451


1617300 


1617994 


1618321 


1619672


1620167


1621838 


1621841 


1623027 


1625428 


1629107 


1629861 


1630668 


1630667 


1631926 


1631353 


1633324 






Initial 
(nt) 


1616298 


1616578 


1617398 


1619616 


1620105 


1621009 


1621056 


1622950 


1624826 


1625925 


1626279


1629298


1629913 


1631329 


1631660 


1631745 


1631933 






SEQ 
NO. 
(a.a) 


5189 


1 

5190 


5191 


5192 


5193 


5194 


5195 


5196 


5197 


5198 


5199 


5200


5201 


5202 


5203 


5204 


5205 






SEQ 
NO. 
(DNA) 


CD 
CO 
CD 


o 

CO 
CD 


cd 

CD 


CN 
CD 
CO 


CO 

cn 

CD 


T 

CO 
CO 


m 

CD 
CD 


CO 
CO 
CO 


r-~ 
o> 

CD 


8691 


CD 
CD 

to 


o 
o 


o 


CM 
O 

r-- 


CO 

o 
r- 


T 
O 

n- 

! 


I in 
o 
Function


hypothetical protein 


nitrogen fixation protein


ABC transporter ATP-binding protein 


hypothetical protein 


ABC transporter 


DNA-binding protein 


hypothetical membrane protein 


ABC transporter 


hypothetical protein 


hypothetical protein 




helicase 


quinone oxidoreductase 


cytochrome o ubiquinol oxidase 
assembly factor / heme O
synthase 


transketolase 


transaldolase 








Matched 
length 

(a.a) 








































CN 

in 


t 


CN 

m 

CN 


CO 


PO 

cn 


CN 


co 

UO 


CO 


CD 
CO 
CN 


cn 

CN 




CO 


CO 
CN 
CO 


in 
cn 

CN 


m 

r-. 
co 


CO 

in 

CO 
























































































o 




CO 


o 


o 




CO 


CO 


CO 


CO 




O 


cn 


CO 


o 


CN 








is 

to 


in 


CO 


oi 

CO 


CO 
CO 


CO 

r-. 




CO 


r-- 


r-- 


r»- 




in 


o 
r-- 


CO 
CO 


8 


in 
co 








Identity 


o 


h- 


CN 


CN 


o 




CO 


CN 


o 


o 




rr 


m 


CO 


o 


o 










CO 


o 


•n 
m 




XT 


CO 


in 




c*V 




co 
CN 


CO 


CO 


10(3 


CN 
CD 










in 


























Nitrobacter winogradskyi coxC 










Table 1 (continued) 


Homologous gene 


Aeropyrum pernix K1 APE202 


Mycobacterium leprae nifS


Streptomyces coelicolor A3(2) 
SCC22.04c 


Mycobacterium tuberculosis 
H37Rv Rv1462


CO 
O 
CO 
CO 

o 

O 

CL 

CL 

</> 
tf> 

"to 

>% 

o 
o 

jc 
o 

0) o 
CO co 


Streptomyces coelicolor A3(2) 
SCC22.08c 


Mycobacterium tuberculosis 
H37Rv Rv1459c


Mycobacterium leprae 
MLCL536.31 abc2 


Mycobacterium leprae 
MLCL536.32 


Mycobacterium tuberculosis 
H37Rv Rv1456c 




Pyrococcus horikoshii PH045( 


Escherichia coli K12 qor 


Corynebacterium glutamicum 
ATCC 31833 tkt 


Mycobacterium leprae 
MLCL536.39 tal


































CO 
db Match


*:C72506 


S72761 


CN 1 

CN 
O 

o 

CO 


A70872 


CO 
> 

> 
CO 

o 
>- 
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LL 
< 

CL 
Ol 




pir:H70968


pir:C70528 


sp:RND_HAEIN 


gp:AB026631J 


pir:E72298 




pir:C70530 


sp:DUT_STRCO 


o 

CO 

in 
o 
r- 
UJ 

O- 






ORF
(bp)


CD 
O 

ro 


CM 
CO 

■^r 


in 

CO 


CO 
CO 

ro 


CO 
Ol 
CO 


1254 


CO 

o 


CD 
CM 


CO 
Ol 
CO 


CN 
CO 


1263 


1908 


1236 


CN 
CO 
CN 


CO 
CO 




CD 

m 


r~ 
o 

CM 




Terminal 
(nt) 


1 

1995783 


1996537 


1997112 


1997503 


1998240 


1999542 


1999949 


1999707 


2000521 


2002112 


2003334 


2003402 


2005462 


2006979 


2006777 


2007738 


2008798 


2008876 




Initial 
(nt) 


1996088 


1996106 


1996768 


1997168 


1997545 


1998289 


1999542 


2000132 


2001216 


2001489 


2002072 


2005309 


2006697 


2006698 


2007637 


2008184 


2008250 


2009082 




SEQ 

NO. 
(a.a.) 


5579 


5580 


5581 


5582 


5583 


5584 


5585 


5586


5587 


5588


5589 


5590 


5591 


5592 


5593 


5594 


5595 


5596 




SEQ 
NO. 
(DNA) 


2079 


2080 


2081


2082 


CO 
CO 

o 

i N 


2084 


2085 


2086 


2087 


2088 


2089


2090 


2091 


2092 


2093 


2094 


2095 


2096 



Function 


hypothetical protein 


extragenic suppressor protein 


polyphosphate glucokinase 


sigma factor or RNA polymerase 
transcription factor 


hypothetical membrane protein 




hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


transferase 


hypothetical protein 


iron dependent repressor or 
diphtheria toxin repressor 


putative sporulation protein 


UDP-glucose 4-epimerase 




hypothetical protein 


ATP-dependent RNA helicase 


Matched 
length 
(aa) 


o 
o 


CO 
CO 


CO 
T 
CM 


o 
o 

LO 


CM 
CN 




CO 

r- 

LO 


CN 


CO 


CO 

CM 
UO 




CO 
CM 
CM 


r- 


O) 
CN 
CO 




to 
o 

CO 


CO 

to 


Similarity 


81.0 


68.2 


80.2 


98.6 


51.4 




80.8 


59.1 


85.5 


61.2 


100.0 


99.6 


64.0 


99.1 




79.0 


50.7 


Identity 
(%) 


o 

CO 


38.4 


54.4 


98.0 


23.9 




61.3


32.3 


CO 
CO 


in 

CO 


97.2 


98.7 


62.0


99.1 




45.3


24.4 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv2699c 


Escherichia coli K12 suhB


Mycobacterium tuberculosis 
H37Rv Rv2702 ppgK


Corynebacterium glutamicum 
sigA 


Bacillus subtilis yrkO 




Mycobacterium tuberculosis
H37Rv Rv2917


Mycobacterium tuberculosis
H37Rv Rv2709


Mycobacterium tuberculosis 
H37Rv Rv2708c


Streptomyces coelicolor A3(2) 
SCH5.08c 


Corynebacterium glutamicum
ATCC 13869 ORF1 


Corynebacterium glutamicum 
ATCC 13869 dtxR 


Streptomyces aureofaciens 


Corynebacterium glutamicum 
ATCC 13869 (Brevibacterium
lactofermentum) galE




Mycobacterium tuberculosis
H37Rv Rv2714 


Saccharomyces cerevisiae 
YJL050W dob1


db Match 


pir:F70530 


_j 
O 
o 

LU 

1 

CO 
X 

CO 

CL 
V) 


sp:PPGK_MYCTU 


< 

CO 
CO 
CN 
T 
O 
CN 
CN 
4-' 

Q. 


sp:YRKO_BACSU 




sp:Y065_MYCTU 


pir:H70531 


pir:G70531


gp:SCH5_8 


CJ 
CO 
CO 
CM 
TT 

o 

CN 
CN 

f 
Q_ 


pir:I40339


GP:AF010134_1 


UJ 

cr 

CO 

id' 
_J 

< 
o 

Cl 


i 


pir:E70532 


sp:MTR4_YEAST 




O) 
CM 


CO 
CO 


CO 
CN 

oo 


1494 


1335 


h~ 

CO 
LO 


1710 


CO 
CO 
CO 


r- 
co 
CM 


1533 


CN 
CO 


T 

' CO 
CD 


■^r 

CO 
CM 


co 
O) 


1323 


in 
o> 


2550 


Terminal 
(nt) 


2009280 


2009724


2011382 


2013356 


2014162 


2015585 


2016257 


2018754 


2017966 


2020276 


2020724 


2022949 


2022313 


2023945 


2023948 


2026379 


2029043 


Initial 
(nt) 


2009570 


2010539 


2010555 


2011863 


2015496 


2016121 


2017966 


2018119 


2018202 


T 

CO 

o 

CN 


2020293 


2022266 


2022546 


2022959 


2025270 


2025423 


2026494 


SEQ 
NO. 
(a.a.) 


5597 


5598 


5599 


5600 


5601 


5602 


5603 


5604 


5605 


5606 


5607 


5608 


5609 


5610 


5611 


5612 


5613 


SEQ
NO.
(DNA)


2097 


2098 


2099 


2100 


2101 


2102


2103


2104 


2105 


2106 


2107 


2108 


2109




2110 


2111 


2112 


2113 



Function 


hydrogen peroxide-inducible genes 
activator 




ATP-dependent helicase 


regulatory protein 




SOS regulatory protein 


galactitol utilization operon repressor 


phosphofructokinase (fructose 1- 
phosphate kinase) 


phosphoenolpyruvate-protein 
phosphotransferase 


glycerol-3-phosphate regulon 
repressor 


1-phosphofructokinase or 6- 
phosphofructokinase


PTS system, fructose-specific IIBC 
component 


phosphocarrier protein 




uracil permease 


ATP/GTP-binding protein 






diaminopimelate epimerase 


Matched 
length 
(aa) 


cn 
cn 

CM 




1298 


in 




CM 
CM 
CN 


in 

CN 


o 

CN 
ro 


CN 

cn 

LO 


CM 
CO 
CN 


m 
ro 


cn 
in 


CO 




r- 
o 


cn 






cn 
CO 
CN 










































Similarity
(%) 


65.6 




76.2 


CN 
CD 
CO 




71.6 


67.8 


55.6 


64.0 


62.6 


55.7 


69.6 


71.6 




70.5 


80.0 






64.7 










































Identity


35.8 




49.2 


CD" 




cn 


33.9 


CN 
CN 


CO 
CO 


26.7 


33.0 


43.0 


o 

CO 




ai 
ro 


s 






33.5 


Homologous gene 


Escherichia coli oxyR 




Escherichia coli hrpA 


Streptomyces clavuligerus nrdR




Bacillus subtilis dinR 


Escherichia coli K12 gatR 


Streptomyces coelicolor A3(2) 
SCE22.14c 


Bacillus stearothermophilus ptsI


Escherichia coli K12 glpR 


Rhodobacter capsulatus fruK 


Escherichia coli K12 fruA 


Bacillus stearothermophilus XL- 
65-6 ptsH 




Bacillus caldolyticus pyrP 


Streptomyces fradiae orfl 1* 






Haemophilus influenzae Rd 
KW20 HI0750 dapF 


db Match 


t 

o 
u 

LU 

I 

or 
> 

X 

O 

o. 

CD 




_j 
O 
o 

LU 

<' 

O- 

or 

X 
cL 

tS) 


gp:SCAJ4870_3 




sp:LEXA_BACSU 


sp:GATR_ECOLI 


T 

*~\ 
CN 
CN 
LU 
O 
CO 
Cl 
cn 


00 

o 
< 

CO 

Cl 
cL 
</) 


sp:GLPR_ECOLI 


sp:K1PF_RHOCA


sp:PTFB_ECOLI 


sp:PTHP_BACST 




_ j 
o 
o 
< 

CO 

a 1 
cr 
>- 

Q_ 

oL 

tn 


gp:AF145049_8 






•z. 

LU 
< 

Q_ 
< 
Q 

CL 

tn 


ORF
(bp)


CO 

cn 


1089 


3906 


o 
in 
■*r 


O 
CM 


CO 
CD 
CO 




o 

CO 

cn 


1704 


CM 

cn 
r-- 


o 
cn 
o> 


1836 


to 

CN 


CM 
CO 

m 


1287 


1458 


CO 
CO 
!-■» 


CO 

in 


CO 
CO 


Terminal 
(nt) 


2030157 


2030277 


2035383 


2035431 


2035990 


2037507 


2038591


2039550 


2039618 


2042519 


2043508 


2045571 


2046028 


2046714 


2047320 


2048650 


2051106 


2051842 


2051845 


Initial 
(nt) 


2029177 


2031365 


2031478 


2035880 


2036409 


2036812 


2037815 


2038591 


2041321 


2041728


2042519 


2043736 


2045762 


2047295 


2048606 


2050107 


2050321 


2051306 


2052675 


SEQ 
NO. 
(a.a.) 


5614 


5615 


5616 


5617 


5618 


5619 


5620


5621 


5622 


5623 


5624 


5625 


5626 


5627 


5628 


5629 


5630


5631 


5632 


SEQ 
NO. 
(DNA)


2114 


2115 


2116 


r- 

CM 


2118 


2119


2120


2121


2122 


2123 


2124 


2125 


2126 


2127


CO 
CN 

CN 


2129 


2130 


2131 


2132 
Function


tRNA delta-2-isopentenylpyrophosphate
transferase




hypothetical protein 






hypothetical membrane protein 


hypothetical protein 


glutamate transport ATP-binding 
protein 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


glutamate transport system 
permease protein 


glutamate transport system 
permease protein 


regulatory protein 


hypothetical protein 




biotin synthase 


putrescine transport ATP-binding 
protein 


hypothetical membrane protein 


Matched 
length 
(aa) 


o 
o 
ro 




in 

T 
rr 






o 
cn 


rr 
cn 
rr 


CN 
rT 
CN 




m 

CN 
CN 


CO 

r-. 

CN 


CM 
rT 


co 




cn 


CO 
CN 
CN 


CO 
CN 
CM 


Similarity 


68.7 




r*- 

iri 
r-- 






63.7 


86.4 


99.6 


73.0 


100.0 


CO 
CO 

cn 


66.9 


71.6 




61.4 


69.5 


58.8 


Identity 
(%) 


o 

S3 
rr 




m 

UJ" 

rr 






29.0 


rr 

CO 
CO 


99.6


cb 

CO 


100.0 


99.3 


rf 
CO 


o 
rr 




33.6


33.2 


CO 
rr 
CM 


Homologous gene 


Escherichia coli K12 miaA 




Mycobacterium tuberculosis
H37Rv Rv2731






Mycobacterium tuberculosis 
H37Rv Rv2732c 


Mycobacterium leprae 
B2235_C2_195


Corynebacterium glutamicum
ATCC 13032 gluA 


Neisseria gonorrhoeae 


Corynebacterium glutamicum 
ATCC 13032 gluC 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 gluD 


Mycobacterium leprae recX 


Mycobacterium tuberculosis 
H37Rv Rv2738c 




Bacillus sphaericus bioY


Escherichia coli K12 potG 


Bacillus subtilis ybaF 


db Match 


sp:MIAA_ECOLI




pir:B70506






pir:C70506 


sp:Y195_MYCLE 


sp:GLUA_CORGL


GSP:Y75358 


sp:GLUC_CORGL 


sp:GLUD_CORGL 


sp:RECX_MYCLE 


pir:A70878 




sp:BIOY_BACSH 


sp:POTG_ECOLI 


pir:F69742 


u 


CO 

o 
o> 


m 
r-. 

CO 


1359 


1020 


1023 


cn 

CO 
CO 


1566 


CO 
CN 
r*- 


219 


rT 
CO 
CO 


cn 

CO 


cn 
in 


rr 

CO 

CN 


CO 
CO 


CO 

m 


o> 
cn 

CO 


cn 
o 

CO 


Terminal 
(nt) 


2052684 


2053609


2055761 


2054724 


2056787


2057120 


2057855 


2060499 


2060196 


2062312 


2063259 


2063298 


2065394 


2065667 


2067141 


2067866 


2068474 


Initial 
(nt) 


2053586 


2054283 


2054403 


2055743 


2055765


2057788 


2059420 


2059774 


2060414


2061629 


rr 

CN 
CO 
O 
CM 


2063894 


2065627 


2066404 


2066566 


2067168 


2067866 


SEQ
NO.
(a.a.)


5633 


5634 


5635 


5636 


5637 


5638 


5639 


5640 


5641


5642 


5643 


5644


5645 


5646 


5647


5648


5649


SEQ 
NO. 
(DNA) 

2133 


2134 


2135 


2136 
2137 

2138


2139 


2140 


2141


2142 
2143 


2144 


2145 
2146 


2147
2148


2149
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TO QJ 



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Function 


bifunctional protein (riboflavin kinase
and FAD synthetase) 


tRNA pseudouridine synthase B 


hypothetical protein 


hypothetical protein 


phosphoesterase 


DNA-damage-inducible protein F


hypothetical protein 


ribosome-binding factor A 


translation initiation factor IF-2 


hypothetical protein 


N-utilization substance protein
(transcriptional
termination/antitermination factor)




hypothetical protein 


peptide-binding protein 


peptide transport system permease


oligopeptide permease 


peptide transport system ABC-
transporter ATP-binding protein


Matched 
length 
(aa) 


CD 
CM 
CO 


co 
o 

CO 




r- 
co 
CM 


CO 
CM 


CO 
CO 


CO 

o 

CO 


CO 
O 


1103 


CO 
CO 


CM 

to 

CO 




in 

CO 


CO 

in 


ro 

CO 


CM 
Ol 
CM 


CM 
LO 

m 


Similarity 
(%) 


79.0 


61.7


73.0 


62.5 


68.9 


78.8 


70.8 


70.4 


62.9 


66.3 


71.0 




65.5 


60.9


69.4 


69.2 


81.3 


Identity
(%)


CM 
CD 


CM 

"CO 


65.0 


42.2 


Ol 

-CD 


51.0 


36.7 


CO 


cO 


CD 


42.3 




to 

CO 


CO 

in 

CM 


r-." 

CO 


CO 
CO 


57.6 


Homologous gene 


Corynebacterium 
ammoniagenes ATCC 6872 ribF 


Bacillus subtilis 168 truB 


Corynebacterium 
ammoniagenes


Streptomyces coelicolor A3(2) 
SC5A7.23 


Mycobacterium tuberculosis 
H37Rv Rv2795c 


Mycobacterium tuberculosis 
H37Rv Rv2836c dinF 


Mycobacterium tuberculosis 
H37Rv Rv2837c 


Bacillus subtilis 168 rbfA 


Stigmatella aurantiaca DW4 infB


Streptomyces coelicolor A3(2) 
SC5H4.29 


Bacillus subtilis 168 nusA 




Mycobacterium tuberculosis 
H37Rv Rv2842c


Bacillus subtilis 168 dppE 


Escherichia coli K12 dppB 


Bacillus subtilis spo0KC


Mycobacterium tuberculosis 
H37Rv Rv3663c dppD 


db Match 


< 
cn 
O 
o 

u_' 
m 
cn 
ol 

to 


3 
CO 
CJ 

< 

m ' 

3 

cn 
t- 

Cl 
to 


r- 
o 
o 

o 

Q_ 

cn 

CL 


CO 
CM 

1 

< 

LO 
O 

cn 

Cl 

Ol 


pir:B70885 


pir:G70693 


pir:H70693 


3 
CO 

o 
< 

CO 

<' 

u_ 

CO 

or 

to 


sp:IF2_STIAU 


Ol 

"i 

I 

LO 
O 

CO 
Cl 
cn 


3 
CO 

o 
< 

CD 

<' 
CO 
3 

cl 

to 




pir:E70588 


3 
CO 

o 
< 

CO 

1 

LU 
CL 
CL 
Q 

Cl 
(/> 


sp:DPPB_ECOLI


prf:1709239C 


CO 

co 
r- 
o 
r-- 
X 

L_- 
CL 


si 


1023


891


228


651


804


1305


996


447


3012


336


996


1254


534


1602


924


999


1731


Terminal 
(nt) 


2086919 


2088863 


2087954 


2089218 


2089861 


2090751 


2092051 


2093055 


2093712 


2096844 


2097380 


2099815 


2098412 


2101841 


2102946 


2103973 


2105703 


Initial 
(nt) 


2087941


2087973 


2088181 


2089868 


2090664 


2092055 


2093046 


2093501 


2096723 


2097179 


2098375 


2098562 


2098945 


2100240 


2102023 


2102975 


2103973 


SEQ 
NO. 
(a.a.) 


5668 


5669 


5670


5671 


5672 


5673 


5674


5675


5676 


5677 


5678 


5679 


5680 


5681 


5682 


5683 
5684 


SEQ 
NO. 
(DNA) 

2168 


2169 


2170 


2171 


2172 


2173 

2174 
2175 


2176 


2177 


2178 

2179 
2180 


2181 


2182


2183 


2184
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o_ 




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Function 


ABC transporter 




hypothetical protein (gcpE protein) 




hypothetical membrane protein 


polypeptides can be used as 
vaccines against Chlamydia 
trachomatis 


1-deoxy-D-xylulose-5-phosphate 
reductoisomerase 








ABC transporter ATP-binding protein 


pyruvate formate-lyase 1 activating 
enzyme 


hypothetical membrane protein 


phosphatidate cytidylyltransferase 


ribosome recycling factor 


uridylate kinase 




elongation factor Ts 


30S ribosomal protein S2 


Matched 
length 
(a.a) 


m 

CN 
IN 




O) 

m 

CO 




m 
o 




CN 
CO 








in 

CN 


to 
in 

CO 


cn 


cn 

CN 


in 
00 


cn 
0 




0 

CO 
CN 


m 

CN 










































Similarity
(%) 


71.1 




73.8 




73.6 


43.0 


42.0 








75.1 


78.0 


74.5 


56.5 


84.3 


43.1 




76.8 


83.5 


Identity 
(%) 


CO 
CO 




44.3




o 

CO 


36.0 


22.8 








37.1 


66.0 


41.5 


33.3 


0 


CO 
CN 




49.6 


10 
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hypothetical protein 
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transcriptional accessory protein 
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serine-rich secreted protein 






histidine secretory acid phosphatase 


tet repressor protein 


glycogen debranching enzyme 


hypothetical protein 


oxidoreductase 


myo-inositol 2-dehydrogenase 


galactitol utilization operon repressor


ferrichrome transport ATP-binding 
protein or ferrichrome ABC 
transporter 


hemin permease 


iron-binding protein 


iron-binding protein 


hypothetical protein 


Matched 
length 




CO 

o 


CN 
CO 

co 


CO 

CO 


CN 
T 

ro 






CN 


o 

CN 


CN 
tN 


CO 

in 

CN 


CO 
CD 
CN 


CO 
CO 


Ol 
CN 
CO 


CO 
CN 


CN 
CO 
CO 


CO 

o 


CN 
CO 


CO 

T— 


Similarity 




81.8 


79.3 


85.7 


54.4 






59.7 


60.8 


75.5 


76.0 


55.2 


60.9 


^' 
CO 


68.3 


71.1 


68.0


67.6 


73.5 


Identity 
(%) 




52.5


CN 

fC 
m 


63.8


CN 

r--' 

CN 






o> 

CN 


- ct? 
cd 
CN 




50.0 


CO 
CN 


35.6


30.4


32.9


CO 

CD 
CO 


d 

CO 


CO 
CO 


38.1


Homologous gene 




Streptomyces coelicolor A3(2) 
hisB 


Streptomyces coelicolor A3(2)
hisC


Mycobacterium smegmatis 
ATCC 607 hisD 


Schizosaccharomyces pombe 
SPBC215.13 






Leishmania donovani SAcP-1


Escherichia coli plasmid RP1 
tetR 


Sulfolobus acidocaldarius treX 


Mycobacterium tuberculosis 
H37Rv Rv2622 


Streptomyces coelicolor A3(2) 
SC2G5.27cgip 


Sinorhizobium meliloti idhA 


Escherichia coli K12 gatR


Bacillus subtilis 168 fhuC 


Vibrio cholerae hutC 


Bacillus subtilis 168 yvrC 


Bacillus subtilis 168 yvrC


Escherichia coli K12 ytfH 


db Match 




sp:HIS7_STRCO 


sp:HIS8_STRCO 


sp:HISX_MYCSM 


CO 

'"t 

in 

CN 
O 

m 
o_ 

CO 
Ol 
cn 






< 

cn 

CD 
CN 

CN 
CO 
CN 
t" 
Q. 


pir:RPECR1 


cn 

CO 

o 

CN 

r-. 
o 

CO 
CN 

t: 

CL 


CN 

r-- 
tn 
o 
r- 
LU 
w 
Cl 


CN 

m' 

O 

CN 

u 

CO 
CL 
Oi 


prf:2503399A


_i 
O 

o 

UJ 

I 

or 
< 

Cl 
tn 


sp:FHUC_BACSU 


UJ 

CO 
CN 

CN 
"t 
Ol 


CD 
T 
O 
O 

O 

Cl 


CD 
T 
O 
O 
r- 
CD 

El 


_i 
o 
u 

LU 

r 

LL 
h- 
> 
Cl 

V) 


ORF 


225


606


1098


1326


1200


651


309


642


561


2508


801


774


1011


996


798


1038


348


594


441


Terminal 
(nt) 


2215639 


2215869 


2216494 


2217600 


2220358 


2220459 


2221919 


2221187 


2222518 


2225035 


2225949 


2225990 


2226769 


2228901 


2229099 


2229900 


2230947 


2231339


2232016 


Initial 
(nt) 


2215863 


2216474 


2217591 


2218925


2219159 


2221109 


2221611 


2221828 


2221958 


2222528 


2225149 


2226763 


2227779 


2227906 


2229896 


2230937 


2231294 


2231932 


2232456 


SEQ
NO.
(a.a.)


5796 


5797 


5798 


5799 


5800 
5801 


5802 
5803 


5804 


5805 


5806 


5807 


5808 


5809 


5810 


5811 


5812 


5813 


5814


SEQ 
NO. 
(DNA) 


2296 


2297 


2298 


2299 


2300 
2301 


2302 
2303 


2304 


2305


2306 


2307 


2308 


2309 


2310 


2311


2312 


2313 


2314 






EP 1 108 790 A2 





Function 


DNA polymerase III epsilon chain 
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hypothetical protein 
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Corynebacterium glutamicum AS019 


DNA polymerase III


chloramphenicol sensitive protein 


histidine-binding protein precursor 


hypothetical membrane protein 
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Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2) 
SCI8.12 




Arthrobacter sp. Q36 treY 


Deinococcus radiodurans 
DR1631 










Photorhabdus luminescens 
ATCC 29999 luxA 


Streptomyces coelicolor A3(2) 
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Arthrobacter sp. Q36 treZ 


Bacillus subtilis 168 


Corynebacterium glutamicum 
ATCC 13032 ilvA 






Catharanthus roseus metE 
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short chain dehydrogenase or 
general stress protein 


diaminopimelate (DAP) 
decarboxylase 


cysteine synthase 




ribosomal large subunit
pseudouridine synthase D 


lipoprotein signal peptidase 




oleandomycin resistance protein 




hypothetical protein 


L-asparaginase 


DNA-damage-inducible protein P 


hypothetical membrane protein 


transcriptional regulator 
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isoleucyl-tRNA synthetase 
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Function 


hypothetical membrane protein 


3-deoxy-D-arabino-heptulosonate-7- 
phosphate synthase 


hypothetical protein 


hypothetical membrane protein 


major secreted protein PS1 protein 
precursor 






hypothetical membrane protein 


acyltransferase 


glycosyl transferase 


protein P60 precursor (invasion- 
associated-protein) 


protein P60 precursor (invasion- 
associated-protein) 


ubiquinol-cytochrome c reductase 
cytochrome b subunit 


ubiquinol-cytochrome c reductase 
iron-sulfur subunit (Rieske [2Fe-2S]
iron-sulfur protein cyoB 


ubiquinol-cytochrome c reductase 
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Mycobacterium tuberculosis 
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Corynebacterium glutamicum 
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Heliobaciilus mobilis petB 
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Function 




heme oxygenase 


glutamate-ammonia-ligase 
adenylyltransferase 


glutamine synthetase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


galactokinase 


virulence-associated protein 




bifunctional protein (ribonudease H 
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Function 




transcriptional regulator 




hypothetical protein 




pyruvate dehydrogenase component 




ABC transporter or glutamine 
transport ATP-binding protein 




ribose transport system permease 
protein 


hypothetical protein 


calcium binding protein 




lipase or hydrolase 


acyl carrier protein


N-acetylglucosamine-6-phosphate 
deacetylase 


hypothetical protein 
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58.7 


62.9 
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75.5 
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Table 1 (continued) 


Homologous gene




Streptomyces coelicolor A3(2) 
SC8F4.22c 




Mycobacterium tuberculosis 
H37Rv Rv2239c 




Streptomyces seoulensis pdhA




Escherichia coli K12 glnQ 




Bacillus subtilis 168 rbsC


Rickettsia prowazekii Madrid E 
RP367 


Dictyostelium discoideum AX2
cbpA 




Streptomyces coelicolor A3(2)
SC6G4.24 


Myxococcus xanthus ATCC 
25232 acpP 


Escherichia coli K12 nagD 


Deinococcus radiodurans 
DR1192 
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CM 
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CO 
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m 
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b. 
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co 
CO 
CO 
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u 
Cl 
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CN 

CD 

CO 

U 
CO 

CL 

cn 
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V) 


i 

CO 

8 
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o 
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co 
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CO 
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CN 


in 
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CO 


CO 
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co 
co 

CO 
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CO 
CO 


o 

CO 


CN 

r- 
co 
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CO 
CM 


m 

CN 
CO 


1032 
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45 
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2377484 


2378276 


2378489 


I 

2378884 


2379770 


2382744 


2380765 


I 
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2385426 


2383622 
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2386580 
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2388821 


2389869 


t 

CO 

*r 
o 

CO 
CO 
CM 


SO 
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2377726 i 


2377899 


2378292 


2379312 


2379426 


2380033 
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2383615 ( 


2384464 


2384509


2385447


2385771 


2386284 
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2387997 


2388838 


2390904
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CO 2 Q 
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Function 


hypothetical protein 












alkaline phosphatase D precursor 




hypothetical protein 


hypothetical protein 




DNA primase


ribonuclease Sa 






L-glutamine:D-fructose-6-phosphate
amidotransferase






deoxyguanosinetriphosphate
triphosphohydrolase 


hypothetical protein 
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Homologous gene 


Streptomyces coelicolor A3(2) 
SC4A7.08 












Bacillus subtilis 168 phoD




Streptomyces coelicolor A3(2) 
SCI51.17 


Mycobacterium tuberculosis
H37Rv Rv2342




Mycobacterium smegmatis 
dnaG 


Streptomyces aureofaciens BMK 






Mycobacterium smegmatis 
mc2155 glmS






Mycobacterium smegmatis dgt


Neisseria meningitidis NMA0251
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Function 


isopentenyl-diphosphate Delta- 
isomerase 












beta C-S lyase (degradation of 
aminoethylcysteine) 


branched-chain amino acid transport 
system carrier protein (isoleucine 
uptake) 


alkanal monooxygenase alpha chain 




malonate transporter 


glycolate oxidase subunit 


transcriptional regulator 




hypothetical protein 




heme-binding protein A precursor 
(hemin-binding lipoprotein) 


oligopeptide ABC transporter 
(permease) 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 
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Homologous gene 


Chlamydomonas reinhardtii ipi1












Corynebacterium glutamicum 
ATCC 13032 aecD 


Corynebacterium glutamicum 
ATCC 13032 brnQ


Vibrio harveyi luxA 




Sinorhizobium meliloti mdcF 


Escherichia coli K12 glcD 


Escherichia coli K12 ydfH




Salmonella typhimurium ygiK 




Haemophilus influenzae Rd 
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Bacillus subtilis 168 appB i 


Escherichia coli K12 dppC 


Escherichia coli K12 oppD 
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Function 


hypothetical protein 


hypothetical protein 


ribose kinase 


hypothetical membrane protein 




sodium-dependent transporter or 
sodium bile acid symporter family


apospory-associated protein C 




thiamine biosynthesis protein x 


hypothetical protein 


glycine betaine transporter 








large integral C4-dicarboxylate 
membrane transport protein 


small integral C4-dicarboxylate 
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Function 


xanthine permease 


2,5-diketo-D-gluconic acid reductase 






50S ribosomal protein L27


50S ribosomal protein L21


ribonuclease E 








hypothetical protein 


transposase (insertion sequence 
IS31831) 


hypothetical protein 


hypothetical protein 


nucleoside diphosphate kinase 




hypothetical protein 


hypothetical protein 




hypothetical protein 






Matched 
length 
(aa) 


CN 
CM 


CM 






CO 


o 


to 

CO 
CO 








tn 

CO 


to 

CO 




CO 


CO 
r— 




CM 
CO 


CN 


CO 






Similarity 
(%) 


77.3 


81.9 






92.6 


82.2 


56.6 








82.6 


100.0 


76.9 


67.8 


89.6 




67.4 


64.3 


68.6 






Identity 
(%) 


cn 
cn 


61.2 






80.3 


to 


30.1 








o 
to 


99.1 


51.3 


37.8 


70.9 




CO 
*T 
CO 


36.6


33.9




Table 1 (continued) 


Homologous gene 


Bacillus subtilis 168 pbuX 


Corynebacterium sp. ATCC 
31090 






Streptomyces griseus IFO13189
rpmA


Streptomyces griseus IFO13189
obg 


Escherichia coli K12 rne 








Streptomyces coelicolor A3(2) 
SCF76.08c 


Corynebacterium glutamicum 
ATCC 31831 


Streptomyces coelicolor A3(2) 
SCF76.08C 


Streptomyces coelicolor A3(2) 
SCF76.09 


Mycobacterium smegmatis ndk 




Deinococcus radiodurans R1 
DR1844


Mycobacterium tuberculosis
H37Rv Rv1883c


Mycobacterium tuberculosis 
H37Rv Rv2446c 




db Match 


sp:PBUX_BACSU 


pir:I40838






sp:RL27_STRGR 


< 

CO 

to 

CM 

o 

CO 
CM 

t' 

CL 


sp:RNE_ECOLI 








gp:SCF76_8 


CO 

to 

CO 

cn 

'5. 


gp:SCF76_8 


to 

LU 

O 

to 

ol 
cn 


i 

in 
cn 
to 
o 

LU 
< 
CL 
Ol 




o 

I 

CM 
O 
CN 
O 
O 
LU 
< 

cL 
cn 


pir:H70515


CO 

to 

CO 

o 
r— 
LU 

'a. 






u 


1887 


cn 

CO 


CM 

to 


to 
co 

CO 


to 

CM 


CO 

o 
ro 


2268 


CO 

to 


ro 
r- 

if) 


r*- 
r»- 


o 
to 


1308 


co 
r- 
co 


o 
tn 


co 
o 


o 
to 

CO 


CN 

ro 


m 
to 


CO 
CN 






Terminal 
(nt) 


2501669 


2501735 


2503355 


2504265 


2503984 


2504300 


2504831 


2507663 


2507710 


2508840 


2509530


2509523 


2511423 


2511876 


2511949 


2512409 


2513144 


2513154 


2513692 






Initial 
(nt) 


2499783 


2502577 


2502735 


2503870 


2504247 


2504602


2507098 


2507115 


2507138 


2508094 


2508922 


2510830 


2511046 


2511427 


2512356 


2512768 


2512803 


2513618 


2514114 






SEQ 

NO. 
(a.a.) 


6085 


6086 


6087 


6088 


6089 


6090 


6091 


6092 


6093 


6094


6095 


6096


6097 




6098 


6099 


6100 


6101 


6102 


6103 






SEQ 
NO. 
(DNA) 


2585 


2586 


2587


2588


2589 


2590 


2591 


2592 


2593


2594


2595 


2596 


2597 


2598 


2599 


2600 


2601 


2602 


2603 
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0) 



Function 


folyl-polyglutamate synthetase 








valyl-tRNA synthetase 


oligopeptide ABC transport system 
substrate-binding protein 


heat shock protein dnaK 


lysine decarboxylase 


malate dehydrogenase 


transcriptional regulator 


hypothetical protein 


vanillate demethylase (oxygenase) 


pentachlorophenol 4- 
monooxygenase reductase 


transport protein 


malonate transporter 


class-III heat-shock protein or ATP-
dependent protease


hypothetical protein 


succinyl CoA:3-oxoadipate CoA 
transferase beta subunit 


succinyl CoA:3-oxoadipate CoA 
transferase alpha subunit 


Matched 
length 
(a.a) 










in 
cn 


CN 

in 


CO 

o 
in 


o 
r-- 


cn 

CO 


r- 
O 
CN 


co 

o 

CN 


r- 
m 

CO 


OO 
CO 
CO 




to 

CO 
CN 


o 

CO 


to 

CO 
CO 


o 

OJ 


in 

CN 


Similarity 


79.6 








72.1 


58.5 


O) 

m 


71.2 


76.5 


56.5 


51.4 


68.6 


59.2 


76.8 


58.4 


85.8 


73.0 


r— 
tri 

CO 


84.5 


Identity 

(%)


tri 
>n 








in 
tri 

■\T~ 


CN 

fN 


26.2 


42.9 


56.4 


to 

CN 


26.0 


to 

CO 


CO 

-OS- - 
CO 


40.8


p 

■aj" 
CN 


59.8 


CO 

cri 


CO 
CO 
CO 


60.2


Homologous gene 


Streptomyces coelicolor A3(2) 
folC 








Bacillus subtilis 168 valS


Bacillus subtilis 168 oppA 


Bacillus subtilis 168 dnaK 


Eikenella corrodens ATCC 
23824 


Thermus aquaticus ATCC 33923 
mdh 


Streptomyces coelicolor A3(2) 
SC4A10.33 


Vibrio cholerae aphA 


Acinetobacter sp. vanA 


Sphingomonas flava ATCC 
39723 pcpD 


Acinetobacter sp. vanK 


Klebsiella pneumoniae mdcF 


Bacillus subtilis clpX


Streptomyces coelicolor A3(2) 
SCF55.28c 


Streptomyces sp. 2065 pcaJ 


Streptomyces sp. 2065 pcaI


db Match 


prf:2410252B 








sp:SYV_BACSU 


pir:A38447 


sp:DNAK_BACSU 


gp:ECU89166_1 


_i 

LL 
LU 
X 

Q 
cL 

1/1 


gp:SC4A10_33 


gp:AF065442_1 


prf:2513416F 


gp:FSU12290_2 


O 

CO 
m 

CN 

tf 

O. 


gp:KPU95087_7 


< 

T 

r>- 

CM 
CO 
O 
CO 
CN 

t: 

CL 


CO 

N l 

in 
to 

LL 

o 

CO 
CL 
O) 


gp:AF109386_2 


to 

CO 
CO 
CD 

o 

LL 
< 
CL 

cn 


ORF
(bp)


1374 


CN 

to 


r- 


CO 

to 
to 


2700 


1575 


1452 


m 

OO 

in 


co 
cn 


r- 
r- 
r^- 


to 
r- 
in 


1128 


tn 
r*- 
cn 


1425 


o 

CO 

cn 


1278 


1086 


co 

CO 

to 


o 
in 


Terminal 
(nt) 


2514114 


2516273 


2516956 


2517751 


2515637 


2518398 


2521660 


2521667 


2522265 


2524337 


2524340 


2526226 


2527207 


2528559 


2528551 


2529484 


2531976 


2531969 


2532604 


Initial 
(nt) 


2515487 


2515662 
2516243 


2517089 


2518336 


2519972 


2520209 


2522251 


2523248 


2523561 


2524915 


2525099 


2526233 


2527135 


2529480


2530761 


2530891 


2532601 


2533353 


SEQ
NO.
(a.a.)


6104


6105


6106


6107 


6108 


6109


6110 


6111 


6112 


6113 


6114 


6115 


6116 


6117 


6118 


6119 


6120 


6121


6122 


SEQ
NO.
(DNA)


2604


2605


2606


2607 


2608 


2609 


2610


2611 


2612 


2613 


2614 


2615 


2616 


2617 


2618 


2619 


2620 


2621


2622 
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Function 


protocatechuate catabolic protein 


beta-ketothiolase 




3-oxoadipate enol-lactone hydrolase 
and 4-carboxymuconolactone 
decarboxylase 


transcriptional regulator 


3-oxoadipate enol-lactone hydrolase 
and 4-carboxymuconolactone 
decarboxylase 




3-carboxy-cis,cis-muconate 
cycloisomerase 


protocatechuate dioxygenase alpha 
subunit 


protocatechuate dioxygenase beta 
subunit 


hypothetical protein 


muconolactone isomerase 




muconate cycloisomerase 




catechol 1,2-dioxygenase 




toluate 1,2 dioxygenase subunit 


Matched 
length 
(aa) 


in 

CM 


CO 

o 

*T 




to 
in 

CM 


LO 
CN 
OO 


to 




co 


CN 


CN 


CO 

r-- 

CN 


CN 
CO 




CN 
CO 




m 

CO 

CN 




co 


Similarity 
(%) 


82.5 


cn 




76.6 


43.0 


89.6 




63.4 


70.6 


91.2 


48.7


81.5 




84.7 




co 

CO 




85.6 


Identity 
(%)


58.2


CO 




CO 

p 


23.6 


78.3 




39.8 


in 

_rr> 


1"- 


CN 


V 

to 




CO 

to 




72.3




CM 
CM 
CO 


Homologous gene 


Rhodococcus opacus 1CP pcaR 


Ralstonia eutropha bktB 




Rhodococcus opacus pcaL 


Streptomyces coelicolor A3(2) 
SCM1.10 


Rhodococcus opacus pcaL 




Rhodococcus opacus pcaB 


Rhodococcus opacus pcaG 


Rhodococcus opacus pcaH 


Mycobacterium tuberculosis 
H37Rv Rv0336 


Mycobacterium tuberculosis 
catC 




Rhodococcus opacus 1CP catB 




Rhodococcus rhodochrous catA 




Pseudomonas putida plasmid 
pDK1 xylX 


db Match 


Ul 
rr 
CN 
CO 
CO 

o 

CN 

*t: 
a. 


prf:2411305D 




prf:2408324E


gp:SCM1_10 


LU 

•^r 

CN 
CO 

CO 

o 

CN 

tf 
CL 




a 

CN 
CO 
CO 
O 

CN 

t: 
Cl 


o 

CN 
CO 

co 

O 
T 
CN 

t: 

CL 


m 

CN 
CO 
CO 

o 

T 
CNI 

tf 
CL 


pir:G70506


prf:2515333B 




sp:CATB_RHOOP 




prf:2503218A 




CO 
*r 
co 

ro 

LL 
< 
CL 

cn 




792


1224


912


753


2061


366


678


1116


612


690


1164


291


771


1119


606


855


141


1470


Terminal 
(nt) 


2534182 


2535424 


2534257 


2536182 


2538256 


2538248


2540230 


2538616 


2539709 


2540335 


2541187 


2542512 


2543813 


2542818 


2544867 


2544022 


2544928 


2546784 


Initial 
(nt) 


2533391 


2534201 


2535168 


2535430 


2536196 


2538613 


2539553 


2539731 


2540320 


2541024 


2542350 


2542802 


2543043 


2543936 


2544262


2544876 


2545068 


2545315 


SEQ
NO.
(a.a.)


6123 


6124 


6125 


6126 


6127 


6128 


6129 


6130 


6131 


6132 


6133 


6134


6135 


6136


6137 


6138 


6139 


6140 


SEQ 
NO. 
(DNA) 

2623 


2624


2625


2626 


2627 


2628 


2629 


2630 


2631 


2632


2633 


2634 


2635 


2636 


2637 


2638 


2639 


2640 
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TO OJ — 

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Function 






galactose-6-phosphate isomerase 


hypothetical protein


hypothetical protein 


aminopeptidase N 


hypothetical protein 








phytoene desaturase 






phytoene dehydrogenase 


phytoene synthase 


multidrug resistance transporter 




ABC transporter ATP-binding protein 


dipeptide transport system
permease protein 


nickel transport system permease 
protein 




Matched 
length 
(aa) 






o 


CO 
*T 

CN 


Ol 

cr> 


o 

Ol 
CO 


CO 

in 
ro 








o 






CD 
CO 


o 

Ol 
CN 


CN 
Ol 
CO 




CO 
CO 

in 


to 
co 

CN 


CO 

CO 
















































Similarity






r-- 


58.1 


80.9 


70.5 


58.1 








81.7






63.8 


58.6 


47.7 




71.6 


73.8 


62.0 




>» 






o 


23.2


55.8


to 


•si- 
CN 








61.5 






31.2 


31.4 


CO 
LO" 
CN 




CO 


38.8


33.2 




Homologous gene 






Staphylococcus aureus NCTC 
6325-4 lacB 


Bacillus acidopullulyticus ORF2 


Mycobacterium tuberculosis
H37Rv Rv2466c 


Streptomyces lividans pepN


Borrelia burgdorferi BB0852 








Brevibacterium linens ATCC 
9175 crtI






Myxococcus xanthus DK1050 
carA2 


Streptomyces griseus JA3933 
crtB 


Listeria monocytogenes lltB




Synechococcus elongatus 1 


Bacillus firmus OF4 dppC 


Escherichia coli K12 nikB 




db Match 






sp:LACB_STAAU 


sp:YAMY_BACAD 


pir:A70866


-j 
or 
y— 

CO 

i 

a. 
< 

CL 
CO 


pir:B70206 








ro 

Ol 
Ol 
CO 

LL 
< 
CL 
Ol 






sp:CRTJ_MYXXA 


sp:CRTB_STRGR 


CO 

rJ 

CN 
CO 

o> 

— > 
< 

_J 
CL 
Ol 




IN 

a' 

CD 

a 
\~ 

>- 

CO 
Cl 
Ol 


sp:DPPC_BACFI 


pir:S47696 




n 


o 

Ol 
CO 


m 

CO 
CO 


T 


to 
Ol 

to 


Ol 

o 
to 


2601 


1083 


1152 


CO 
CD 
CD 


to 
m 


r~ 
CM 

ro 




CO 

r- 
co 


1206 


CO 

CO 


1119 


1233 


1641 


CN 
CO 
CO 


Ol 
CO 

Ol 


1707 


Terminal 
(nt) 


2562387 


2563847 


2563932 


2564550 


2565623 


2568945 


2570293 


2570309 


2572175 


2572348 


2572351


2572807 


2573393 


2572659 


2573843 


2574780 


2575981 


2577232 


2578879 


2579769 


2580711 


Initial 
(nt) 


2562776 


2562963 


2564402 


2565245 


2566231 


2566345 


2569211 


2571460 


2571510 


2572193 


2572677


2572977 


2573770


2573864 


2574718 


2575898 


2577213 


2578872 


2579760 


2580707 


2582417 


SEQ
NO.
(a.a.)


6159 


6160 


6161 


6162 


6163 


6164 


6165 


6166 


6167 


6168 


6169 


6170 


6171 


6172 


6173 


6174 


6175 


6176 


6177 


6178 


6179


SEQ 
NO. 
(DNA)


2659 


2660 


2661 


2662 


2663 


2664 


2665 


2666


2667


2668


2669 


2670 


2671 


2672 


2673 


2674 


2675


2676 


2677 


2678 


2679
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Function 




acetylornithine aminotransferase 


hypothetical protein 


hypothetical membrane protein 


acetoacetyl CoA reductase 


transcriptional regulator, TetR family 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


ABC transporter ATP-binding protein 


globin 


chromate transport protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical membrane protein 


alkaline phosphatase 


Matched
length
(a.a.)






CM 
00 

■<r 


218 


to 
ro 

CN 


o 

CN 


T 
CD 


CO 
CO 
CN 


CO 
CM 


to 

o> 

CO 


CO 
CD 


CN 




in 
in 


CO 
CO 

m 


CM 


o 
o 
r->- 


CO 
CO 

m 


Similarity 
(%) 




63.5 


47.9 


79.4 


60.0 


55.0


47.0 


65.1 


77.0 


60.4 


68.9 


61.4 




60.0 


79.6 


62.2 


56.7 


52.6 


Identity 




CO 


uV 
CM 


49.1 


"CLT 
CN 


26.7 


38.0 


31.1 


CM 
CO 
LO 


CO 
CM 


CO 

r--' 
co 


CN 
CO 
CO 




. 

CO 
CO 


52.8


CO 


28.0


28.0 


Homologous gene 




Corynebacterium glutamicum 
ATCC 13032 argD 


Mycobacterium tuberculosis 
H37Rv Rv1128c 


Mycobacterium tuberculosis 
H37Rv Rv0364 


Chromatium vinosum D phbB 


Streptomyces coelicolor actII


Neisseria meningitidis 


Pseudomonas putida GM73 
ttg2A 


Mycobacterium leprae 
MLCB1610.14c 


Pseudomonas aeruginosa 
Plasmid pUM505 chrA 


Mycobacterium tuberculosis 
H37Rv Rv2474c 


Streptomyces coelicolor A3(2) 
SC6D10.19C 




Aeropyrum pernix K1 APE1182 


Escherichia coli K12 yjjK 


Mycobacterium tuberculosis 
H37Rv Rv2478c 


Mycobacterium leprae o659 


Bacillus subtilis phoB 


db Match 




sp:ARGD_CORGL


pir:A70539 


t- 

O 
>- 

to' 

CM 
< 
> 

id. 

t/i 


sp:PHBB_CHRVI 


CD 

O 
O 

< 

l: 

Q. 


m 
r- 
co 

T 

> 
CL 
00 

CD 


cn' 
o 
o 

CO 

o 

LL 

< 
CL 
CO 


? 

CO 
CD 

o 

_J 
ol 


LU 
< 
LU 
CO 
CL 

<* 

or 

X 

o 

i/> 


pir:A70867 


gp:SC6D10_19 




CO 

m 

CM 

h- 

CD 

LJ 
CL 


_j 
o 
o 

LU 

> 

id 
to 


r— 

CO 
CO 

o 
r- 
LU 

'5. 


sp:Y05L_MYCLE 


to 
r-» 
to 
cn 
to 
O 

Q_ 


ORF
(bp)


1941


1314


1584


747


708


738


441


792


393


1128


627


465


621


162


1668


615


2103


1419


Terminal 
(nt) 


2584504 


2585926 


2587763 


2588722 


2588725 


2590302 


2591137 


2591574


2592794 


2593965 


2593968 


2594597 


2595188 


2595822 


2596048 


2597869 


2598662 


2602879 


Initial 
(nt) 


2582564 


2584613 


2586180


2587976 


2589432 


2589565 


2590697 


2592365 


2592402 


2592838 


2594594 


2595061 


2595808 


2595983 


2597715 


2598483 


2600764 


2601461 


SEQ 

NO. 
(a.a.) 

6180 


6181 


6182 


6183 


6184 


6185 


6186 


6187 


6188 


6189 


6190 


6191 


6192 


6193 


6194


6195 


6196 


6197 


SEQ 
NO. 
(DNA) 


2680 


2681 


2682 


2683 


2684 


2685


2686 


2687 


2688 


2689 


2690 


2691 


2692 


2693 


2694 


2695 


2696 


2697 
[Table 1 (continued): the rows for SEQ NO. (DNA) 2698 to 2715 are not legible in the source]
Function


ferric enterochelin esterase 


lipoprotein 








transposase (IS1207) 






transcriptional regulator 


glutaminase 


sporulation-specific degradation 
regulator protein 




uronate isomerase 




hypothetical protein 


pyrazinamidase/nicotinamidase 


hypothetical protein 


bacterioferritin comigratory protein 


bacterial regulatory protein, tetR 
family 




Matched 
length
(a.a) 


in 


CO 
CO 
CO 








to 

CO 






CO 


CO 

m 

ro 


co 




m 

CO 
CO 




CO 
CN 


to 

CO 


in 


















































Similarity
(%)


50.9 


71.9 








99.8 






63.4 


69.3 


72.2 




CO 

0 

CO 




45.0 


74.6 


80.0 


73.8 


61.4 




Identity 


26.0 


48.5 








99.5






CO 

CN 
CO 


35.2 


CO 
CN 




29.0 




32.0 


cO 


CN 


46. fl | 


CN 
CO 



Table 1 (continued) 


Homologous gene 


Salmonella enterica iroD 


Mycobacterium tuberculosis 
H37Rv Rv2518c lppS








Corynebacterium glutamicum 
ATCC 21086 






Salmonella typhimurium KP1001 
cytR 


Rattus norvegicus SPRAGUE- 
DAWLEY KIDNEY 


Bacillus subtilis 168 degA 




Escherichia coli K12 uxaC 




Zea diploperennis perennial 
teosinte 


Mycobacterium avium pncA 


Mycobacterium tuberculosis 
H37Rv Rv2520c 


Escherichia coli K12 bcp 


Streptomyces coelicolor A3(2) 
SCI11.01C 




db Match 


prf:2409378A


pir:C70870 








gp:SCU53587_1 






gp:AF085239_1 


sp:GLSK_RAT 


0 

* -c- 

O) 
CO 

CO 

< 

Q. 




sp:UXAC_ECOLI 




0 

CN 

m 

CO 

t: 

CL 


< 

CN 
CO 
CN 

Q. 


pir:E70870 


_! 
O 
O 
LU 

I 

CL 

O 
CO 

CL 

to 


0 

CO 

CL 

cn 




ORF
(bp)


1188


1209


645


150


246


1308


207


639


453


1629


477


555


1554


501


1197


558


273


535


636




Terminal 
(nt) 


2619541 


2620973 


2623605


2623621 


2624048 


2624051 


2625806 


2625809 


2628376 


2626493 


2628852 


2628324 


2630479 


2631136 


2632466 


2633100 


2633146 


2633066


2634751 




Initial 
(nt) 


2620728 


2622181 


2622961 


2623770 


2623803 


2625358 


2625600 


2626447


2627924 


2628121


2628376 


2628878 


2628926 


2630636 


2631270 


2632543 


2633418 


2633600 


2634116 






SEQ
NO.
(a.a.)


6216


6217 


6218 


6219 


6220 


6221 


6222 
6223 


6224 


6225 


6226 


6227 


6228 


6229 


6230 


6231 


6232 


6233 


6234 




SEQ 
NO. 
(DNA) 


2716 


2717 


2718 


2719 


2720


2721 


2722 


2723


2724 


2725 


2726 


2727 


2728 


2729 


2730 


2731


2732 


2733 


2734 
Function


phosphopantetheine protein
transferase 


lincomycin resistance protein 


hypothetical membrane protein 




fatty-acid synthase 


hypothetical protein 


peptidase 


hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 


ribonuclease PH 








hypothetical membrane protein 


transposase (IS1628) 




arylsulfatase 




Matched 
length 
(aa) 


LO 


CO 

r*. 

T 


CO 




3029 


o 


o 

CO 
CN 


CM 


ro 


CM 

o 

CM 


CO 
CO 
CN 








CO 
CM 
XT 


LO 

r-- 




o 

LO 
CN 












































Similarity
(%)


75.9 


85.6 


54.0 




83.6 


55.2 


60.9 


67.9 


69.0 


76.7 


TT 
CO 








58.2 


97.2 




TT 

r- 












































Identity
(%)


56.6 


T 

in 


30.1 




CO 

CM" ~ 
CD 


25.3 


40.4 


40.2 


37.2


55.0


60.2








o 
o> 

CN 


92.1 




cb 



Table 1 (continued) 


Homologous gene 


Corynebacterium 
ammoniagenes ATCC 6871 ppt1 


Corynebacterium glutamicum 
lmrB


Synechocystis sp. PCC6803 




Corynebacterium 
ammoniagenes fas 


Streptomyces coelicolor A3(2) 
SC4A7.14 


Mycobacterium tuberculosis 
H37Rv Rv0950c 


Mycobacterium tuberculosis 
H37Rv Rv1343c 


Mycobacterium leprae 
B1549_F2_59 


Mycobacterium tuberculosis 
H37Rv Rv1341 


Pseudomonas aeruginosa 
ATCC 15692 rph 








Mycobacterium tuberculosis 
H37Rv SC8A6.09C 


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB 




Mycobacterium leprae ats 




db Match 


gp:BAY15081_1 


gp:AF237667_1 


pir:S76537 




pir:S2047


rJ 
< 

o 

CO 

d. 
cn 


pir:D70716 


h- 
o 
> 

5 

rJ 
r- 
o 
> 
cL 

crt 


LU 
— 1 

o 
>- 

CD' 

o 

>- 

Ol 
to 


sp:Y03Q_MYCTU 


LU 
< 
LU 
CO 

a 

X 

a 

CC 
b_ 

CO 








1- 
O 
> 

o' 

CM 
O 
> 
CL 

to 


gp:AF121000_8 




sp:Y03O_MYCLE 




ORF
(bp)


405


1425


324


414


8979


1182


615


462


354


618


735


246


693


582


1362


534


660


765




Terminal 
(nt) 


2634747 


2635165 


2637168 


2637240 


2638649 


2648235 


2650164 


2650902 


2651339 


2651420 


2652067 


2653009 


2653326 


2654079 


2654875 


2656985 


2656974 


2657736 




Initial 
(nt) 


2635151 


2636589 


2636845 


2637653 


2647627 


2649416 


2649550 


2650441 


2650986 


2652037 


2652801 


2653254 


2654018 


2654660 


2656236 


2656452 


2657633 


2658500 




SEQ
NO.
(a.a.)


6235 


6236 


6237 


6238 


6239 


6240


6241 


6242


6243


6244 


6245 


6246 


6247


6248 


6249 


6250 


6251 


6252




SEQ 

NO. 
(DNA)


2735 


2736 


2737 


2738 


2739 


2740 


2741


2742 


2743 


2744 


2745 


2746


2747


2748


2749


2750


2751


2752



Function 


D-glutamate racemase 




bacterial regulatory protein, marR 
family 


hypothetical membrane protein 




endo-type 6-aminohexanoate 
oligomer hydrolase 


hypothetical protein 


hypothetical protein 




hypothetical protein 




ATP-dependent helicase 


hypothetical membrane protein 


hypothetical protein


phosphoserine phosphatase 




cytochrome c oxidase chain I 




Matched 
length 
(aa) 


■*r 
co 
Oi 






to 

OJ 
OJ 




CN 
CO 


o 
o 

OJ 


to 
o 




CO 
OJ 
T 




r-» 

CO 


CO 
CO 


Ol 
OJ 
OJ 


O 

CO 




in 

fx- 

m 




Similarity 
(%) 


99.3 




70.8 


69.3 




58.3 


58.5 


77.1 




CO 

o 

CO 




53.3 


60.1 


52.0 


61.0 




74.4 




>» 


99.3 




OJ 


38.2 




30.2 


o 

LO 
CO 


in 




CN 
CO 




to 

OJ 


i> 
ai 

OJ 


39.0 


hr. _ 
CO 
CO 




46.8




Homologous gene 


Corynebacterium glutamicum 
ATCC 13869 murI




Streptomyces coelicolor A3(2) 
SCE22.22


Mycobacterium tuberculosis 
H37Rv Rv1337




Flavobacterium sp. nylC 


Mycobacterium tuberculosis 
H37Rv Rv1332


Mycobacterium tuberculosis 
H37Rv Rv1331




Mycobacterium tuberculosis 
H37Rv Rv1330c 




Escherichia coli dinG 


Mycobacterium tuberculosis 
H37Rv Rv2560 


Streptomyces coelicolor A3(2) 
SC1B5.06c 


Escherichia coli K12 serB 




Mycobacterium tuberculosis 
H37Rv Rv3043c 




db Match 


prf:2516259A 




gp:SCE22_22 


t— 
O 
> 

1 

CO 

o 
>- 

CL 

</> 




cn 

CO 

o 

3 

u* 

CL 


sp:Y03H_MYCTU 


Z) 
r— 

o 

i 

O 

CO 

o 
>■ 

CL 
</) 




r) 
h- 
o 
> 

CO 
O 
>■ 

CL 
Wl 




prf:1816252A 


sp:Y0A8_MYCTU 


pir:T34684 


sp:SERB_ECOLI




pir:D45335 




ORF
(bp)


852


636


492


747


891


960


537


300


624


1338


306


1740


891


723


1017


1596


1743


306


Terminal 
(nt) 


2658606 


2660131 


2660147


2660671 


2662455 


2661417 


2662331


2662883 


2664060 


2665397 


2665992 


2667854 


2667870 


2668839 


2669557 


2672721 


2671063 


2673255 


Initial 
(nt) 


2659457 


2659496 


2660638


2661417


2661565 


2662376


2662867 


2663182 


2663437 


2664060 


2665687 


2666115 


2668760 


2669561 


2670573 


2671126 


2672805 


2672950


SEQ 
NO. 
(a.a.)


6253 


6254 


6255 


6256 


6257 


6258 


6259 


6260 


6261 


6262 


6263 


6264 


6265 


6266 


6267 


6268 


6269 


6270


SEQ 
NO. 
(DNA)


2753 


2754


2755 


2756 


2757 


2758 


2759 


2760 


2761


2762 


2763 


2764 


2765 


2766 


2767 


2768 


2769 


2770



EP 1 108 790 A2


Function 


ribonucleotide reductase beta-chain 


ferritin 


sporulation transcription factor 


iron dependent repressor or 
diphtheria toxin repressor


cold shock protein TIR2 precursor 


hypothetical membrane protein 


ribonucleotide reductase alpha- 
chain 




50S ribosomal protein L36


NH3-dependent NAD(+) synthetase 






hypothetical protein 


hypothetical protein 


alcohol dehydrogenase 


Bacillus subtilis mmg (for mother cell 
metabolic genes) 


hypothetical protein 




phosphoglucomutase 




Matched 
length 
(aa) 


CO 

CO 


cn 
tn 


to 
in 

CN 


m 

CN 
CN 


CN 


o 

tn 


o 






o 

CM 






in 

CN 


to 


co 

CO 


CO 

m 


CO 
CN 




to 
m 
in 














































Similarity
(%) 


99.7 


64.2 


60.2 


60.4 


62.1 


86.0 


100.0 




79.0 


78.1






56.4 


68.8 


52.8 


56.0 


66.2 




80.6 




Identity 
(%) 


99.7 


in 

CO 


32.8


CO 
CN 


CN 

*r 

CN 


50.0


99.9




CO 

in 


_CO„ 

in 
m 






o 

CO 




to 

CN 


27.0 


33.8




61.7


Table 1 (continued)


Homologous gene 


Corynebacterium glutamicum
ATCC 13032 nrdF 


Escherichia coli K12 ftnA


Streptomyces coelicolor A3(2)
whiH 


Corynebacterium glutamicum 
ATCC 13869 dtxR 


Saccharomyces cerevisiae 
YPH148 YOR010C TIR2 


Archaeoglobus fulgidus AF0251 


Corynebacterium glutamicum 
ATCC 13032 nrdE 




Rickettsia prowazekii 


Bacillus subtilis 168 nadE 






Synechocystis sp. PCC6803
slr1563 


Mycobacterium tuberculosis 
H37Rv Rv3129 


Bacillus stearothermophilus 
DSM 2334 adh 


Bacillus subtilis 168 mmgE 


Arabidopsis thaliana T6K22.50




Escherichia coli K12 pgm




db Match 


CD 
CO 
UO 
CN 

< 
cl 

Dl 


sp:FTNA_ECOLI


t 

X 
X 

CO 

< 
O 
GO 
Cl 
Ct> 


o> 

CO 
CO 

o 
2! 

"cL 


h- 

c7) 
< 
LU 
> 

I 

CM 

at 
t- 

CL 

tn 


pir:C69281 


gp:AF112535_3 




sp:RL36_RICPR


sp:NADE_BACSU 






pir:S76790 


pir:G70922 


sp:ADH2_BACST 


Z) 
CO 

O 

< 
m 

i 

LU 

cL 

v> 


r-- 

m 
o 
r- 
i_ 
*o_ 




sp:PGMU_ECOLI




ORF
(bp)


1002


486


750


660


438


276


2121


315


141


831


93


498


747


288


1020


1371


834


792


1662




Terminal 
(nt) 


2673338 


2675289 


2676240 


2676243


2677377 


2676918 


2677478 


2680784 


2681223 


2682376 


2681464 


2683616


2682379 


2683131 



2683627 


2686289 


2687148 


2687449 


2688389 




Initial 
(nt) 


2674339 


2674804


2675491 


2676902


2676940 


2677193 


2679598 


2680470 


2681363


2681546


2681556 


2683119 


2683125 


2683418 


2684646


2684919 


2686315 


2688240 


2690050 




SEQ
NO.
(a.a.)


6271 


6272 


6273 


6274


6275 


6276 


6277 


6278 


6279 


6280


6281 


6282 


6283 


6284 


6285 


6286 


6287 


6288 


6289 




SEQ 
NO. 
(DNA) 


2771 


2772 


2773 


2774


2775 


2776 


2777 


2778


2779 


2780 


2781 


2782 


2783 


2784 


2785 


2786 


2787 


2788


2789
Function


hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 


transposase (IS1676) 


major secreted protein PS1 protein 
precursor 








transposase (IS1676) 




proton/sodium-glutamate symport 
protein 




ABC transporter 




ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical protein 




oxidoreductase or dehydrogenase 


Matched 
length 
(aa) 


CO 


CN 
CN 


in 

CN 


CD 
CO 
^1" 


in 
in 

CO 








o 
o 
m 




CO 
CO 




CO 

co 




CO 

CN 


CO 


CN 




to 

CO 


Similarity 
(%) 


64.3 


61.5 


79.1 


48.6 


49.6 








46.6 




66.2 




69.0




79.8 


67.0 


75.0 




LO 


Identity 
(%) 




CN 


CN 
LO 


24.2 


CO 

W" 
CN 








CO 
CN 




30.8 




33.0




■*T _ 
LO 


60.0


71.0 




28.1 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3069


Helicobacter pylori J99 jhp1146


Bacillus subtilis 168 ycsI


Rhodococcus erythropolis 


Corynebacterium glutamicum
(Brevibacterium flavum) ATCC
17965 csp1








Rhodococcus erythropolis 




Bacillus subtilis 168 




Streptomyces coelicolor A3(2)
SCE25.30 




Staphylococcus aureus 


Chlamydophila pneumoniae 
AR39 CP0987 


Chlamydia muridarum Nigg 
TC0129 




Streptomyces collinus Tu 1892
ansG 


db Match 


pir:F70650 


pir:D71843 


sp:YCSI_BACSU 


gp:AF126281_1 


sp:CSP1_CORGL 








CO 
CN 
CO 
CN 

LL 
< 
Cl 
O) 




< 
o 
o 
< 
m 

i 

t- 

r- 

_J 

CD 

CL 
l/l 




o 

CO 

1 

in 

CN 
UJ 

o 

CO 

Q. 

cn 




gp:SAU18641_2 


PIR:F81516


r- 
eo 
h- 

CO 
LL 

c: 
cl 




prf:2509388L 


ORF
(bp)


288


324


792


1365


1620


354


165


447


1401


768


1338


693


2541


891


708


273




CO 

h- 
CO 


CN 

to 


Terminal 
(nt) 


2690437 


2690760 


2691564 


2693053 


2694918 


2695279


2695718


2695320


2697212 


2697383 


2698194 


2701612 


2699926 


2703356 


2702487 


2704586 


m 

CD 

o 
r-~- 
CN 


2710555 


2711308 


Initial 
(nt) 


2690150 


2690437 


2690773 


2691689 


2693299 


2694926 


2695554 


2695766 


2695812 


2698150 


2699531 


2700920 


2702466 


2702466 


2703194 


2704314 


2704835 


2709878 


CO 

to 
o 

Y-* 

CN 


SEQ 

NO. 
(a.a.) 


6290 
6291 


6292 


6293 


6294


6295


6296


6297 


6298 


6299


6300 


6301 


6302 


6303 


6304 
6305 


6306 


6307 


6308 


SEQ 
NO. 

(DNA)

2790 

2791 
2792 
2793 


2794 


2795 


2796 


2797 


2798 


2799 


2800 


2801


2802 


2803 


2804


2805


2806 


2807 


2808 





Function 


methyltransferase 


hypothetical protein 


hypothetical protein 




UDP-N-acetylglucosamine 1- 
carboxyvinyltransferase 


hypothetical protein 


transcriptional regulator 




cysteine synthase 


O-acetylserine synthase 


hypothetical protein 


succinyl-CoA synthetase alpha 
chain 


hypothetical protein 


succinyl-CoA synthetase beta chain 




frenolicin gene E product 




succinyl-CoA coenzyme A 
transferase 


transcriptional regulator 




Matched 
length
(aa) 


m 
o 

OJ 


T 

co 


CN 






o 

CO 


CO 
OJ 




m 
o 

CO 


OJ 
r-- 


CO 
CO 


cn 

OJ 


in 


o 
o 




CO 
CN 




o 
m 


CN 
CO 




Similarity 
(%) 


51.2 


66.0


75.0 




75.3 


84.2 


69.0 




84.6 


79.7 


65.1 


79.4 


o 

CO 


73.0 




71.8 




77.8 


68.5 




Identity 


o 

CN 


61.0 


o 




CO 


66.3 


45.9




m 


61.1 


36.1 


52.9 


o 

oT 


CO 

cn 

CO 




JjO 
CO 
CO 




cn 

T 


to 

CO 
CO 



Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0089 


Chlamydia pneumoniae 


Chlamydia muridarum Nigg 
TC0129




Acinetobacter calcoaceticus 
NCIB 8250 murA


Mycobacterium tuberculosis 
H37Rv Rv1314c


Streptomyces coelicolor A3(2)
SC2G5.15c 




Bacillus subtilis 168 cysK 


Azotobacter vinelandii cysE2 


Deinococcus radiodurans R1
DR1844 


Coxiella burnetii Nine Mile Ph I 
sucD 


Aeropyrum pernix K1 APE 1069 


Bacillus subtilis 168 sucC 




Streptomyces roseofulvus frnE 




Clostridium kluyveri cat1 cat1


Azospirillum brasilense ATCC 
29145 ntrC 




db Match 


sp:Y089_MYCTU 


GSP:Y35814 


PIR:F81737 




< 
o 

o 
< 

<■ 

or 

Cl 
to 


sp:Y02Y_MYCTU 


gp:SC2G5_15




sp:CYSK_BACSU


prf:2417357C 


gp:AE002024_10


sp:SUCD_COXBU 


PIR:F72706 


sp:SUCC_BACSU 




gp:AF058302_5 




_j 

x. 
O 
_j 

o 

Cl 

to 


sp:NIR3_AZOBR 




SI 


LO 

CN 

m 


ro 

CN 




m 
cn 
Function


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


5'-phosphoribosyl-N-
formylglycinamidine synthetase




5'-phosphoribosyl-N-
formylglycinamidine synthetase


hypothetical protein 




glutathione peroxidase


extracellular nuclease 




hypothetical protein 


C4-dicarboxylate transporter 


dipeptidyl aminopeptidase
Matched
length 
(aa) 


CN 


m 

CO 


CN 


CM 


CO 
CD 
r- 




CO 
CM 
CM 


CO 
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CO 
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<^ 
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cn 

CO 
Similarity
(%)


75.8 


94.0 


87.1 


71.0 


89.5 




93.3 


93.7 




77.9 


51.5 




68.7 


81.6 


70.6 






Identity 
(%) 


57.3 


75.9 


67.7 


64.0 


CD 
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r- 




80.3


81.0 




_CN . 
CD 
T 


28.0 




CO 


49.0 


CO 


Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0807 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF2 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF1 


Sulfolobus solfataricus 


Corynebacterium 
ammoniagenes ATCC 6872 
purL 




Corynebacterium 
ammoniagenes ATCC 6872 
purQ 


Corynebacterium 
ammoniagenes ATCC 6872 
purorf 




Lactococcus lactis gpo 


Aeromonas hydrophila JMP636 

nucH 




Mycobacterium tuberculosis 
H37Rv Rv0784


Salmonella typhimurium LT2 
dctA 


Pseudomonas sp. WO24 dapb1
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m 

CO 

o 
o 

CD 

< 

Cl 
CO 


CN 

o 

CO 
Cn 
OO 

5 

GO 
GO 

CL 


gp:AB003162_3 
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CN 




Terminal 
(nt) 


2747683 


2749111 


2749162 


2752103 
2756739
2757863
2753237
2753804
2756851
2759200


2761649 






SEQ 

NO. 
(a.a.) 


6343 


6344


6345 


6346 


6347


6348


6349 


6350 


6351 
Function


pyruvate oxidase 


multidrug efflux protein 


transcriptional regulator 


hypothetical membrane protein 




3-ketosteroid dehydrogenase 


transcriptional regulator, LysR family 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical membrane protein 


transcription initiation factor sigma 


trehalose-6-phosphate synthase 




trehalose-phosphatase 


glucose-resistance amylase 
regulator 


high-affinity zinc uptake system 
protein 
length
Similarity
(%) 


75.6 


68.9


68.5 


78.4 




62.1 


69.0 


52.9 


55.6 
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m 


64.0 


50.3 
57.6


60.2 


46.7 
Table 1 (continued)


Homologous gene 



Escherichia coli K12 poxB


Staphylococcus aureus plasmid 
pSK23 qacB 


Escherichia coli K12 ycdC


Mycobacterium tuberculosis 
H37Rv Rv2508c 




Rhodococcus erythropolis SQ1
kstD1 


Bacillus subtilis 168 alsR


Mycobacterium tuberculosis 
H37Rv Rv3298c lpqC


Bacillus subtilis 168 ykrA 




Oryctolagus cuniculus kidney
cortex rBAT 


Mycobacterium tuberculosis 
H37Rv Rv3737 


Streptomyces griseus hrdB 


Schizosaccharomyces pombe 
tps1 




Escherichia coli K12 otsB 


Bacillus megaterium ccpA 


Haemophilus influenzae Rd 
HI0119 znuA
Function


dihydrodipicolinate synthase 


glucokinase 


N-acetylmannosamine-6-phosphate 
epimerase 




sialidase precursor 


L-asparagine permease operon 
repressor 


dipeptide transporter protein or 
heme-binding protein 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


oligopeptide transport ATP-binding 
protein 


homoserine/homoserine lactone
efflux protein or lysE type
translocator


leucine-responsive regulatory
protein 




hypothetical protein 


hypothetical protein 


transcription factor 
length
Homologous gene


Escherichia coli K12 dapA 


Streptomyces coelicolor A3(2) 
SC6E10.20C glk 


Clostridium perfringens NCTC 
8798 nanE 




Micromonospora viridifaciens 
ATCC 31146 nedA


Rhizobium etli ansR 


Bacillus firmus OF4 dppA 


Bacillus firmus OF4 dppB
Function


GTP cyclohydrolase I 




cell division protein FtsH 


hypoxanthine 
phosphoribosyltransferase 


cell cycle protein MesJ or cytosine 
deaminase-related protein 


D-alanyl-D-alanine
carboxypeptidase 


inorganic pyrophosphatase 




spermidine synthase 


hypothetical membrane protein 


hypothetical protein 


hypothetical protein 


hypothetical protein 


PTS system, beta-glucosides- 
permease II ABC component 




ferredoxin reductase 


hypothetical protein 


bacterial regulatory protein, marR 
family 
length
(a.a.)
Similarity


86.2 




69.0 


83.0 


66.8 


51.4 


73.6 




80.7 


86.4 


63.2 


60.1 


72.3 


59.6 




to 

CO 
73.2
30.3
Homologous gene
phenylacetaldehyde dehydrogenase


hypothetical protein 
Function


virulence factor 


virulence factor 


virulence factor 



sodium/glutamate symport carrier 
protein 


cadmium resistance protein 


cation efflux system protein 
(zinc/cadmium) 


monooxygenase or oxidoreductase 
or steroid monooxygenase 


alkanal monooxygenase alpha chain 




cystathionine gamma-lyase


bacterial regulatory protein, lacI
family 


rifampin ADP-ribosyl transferase 


rifampin ADP-ribosyl transferase 


hypothetical protein 


hypothetical protein 


oxidoreductase 
Table 1 (continued)


Homologous gene 
ORF24222


Pseudomonas aeruginosa 
ORF23228 


Pseudomonas aeruginosa 
ORF25110 


Synechocystis sp. PCC6803
Staphylococcus aureus cadC


Pyrococcus abyssi Orsay 
PAB0462 


Rhodococcus rhodochrous 
Kryptophanaron alfredi symbiont
luxA 




Escherichia coli K12 metB 


Streptomyces coelicolor A3(2) 
SC1A2.11 


Streptomyces coelicolor A3(2) 
SCE20.34c arr 


Streptomyces coelicolor A3(2) 
SCE20.34c arr 


Mycobacterium tuberculosis 
H37Rv Rv0837c 


Mycobacterium tuberculosis 
H37Rv Rv0836c 


Mycobacterium tuberculosis 
H37Rv Rv0385 
Function










hypothetical membrane protein 


hypothetical protein 




sulfate adenylyltransferase, subunit 
sulfate adenylyltransferase small
chain 


phosphoadenosine phosphosulfate 
reductase 


ferredoxin-nitrate reductase 


ferredoxin/ferredoxin-NADP 
huntingtin interactor
ammonia monooxygenase
hypothetical protein




hypothetical protein 


ABC transporter 


ABC transporter 


metabolite transport protein homolog 






succinyl-diaminopimelate 
desuccinylase 








dehydrin-like protein 


maltose/maltodextrin transport ATP-
binding protein




cobalt transport protein 


NADPH-flavin oxidoreductase 


inosine-uridine preferring nucleoside 
hydrolase 


hypothetical membrane protein 


DNA-3-methyladenine glycosylase 
Bacillus subtilis ydeG
Daucus carota


Escherichia coli K12 malK 
pNZ4000 Orf-200 cbiM


Vibrio harveyi MAV frp 


Crithidia fasciculata iunH 


Streptomyces coelicolor A3(2) 
SCE20.08c 


Escherichia coli K12 tag


Alcaligenes eutrophus H16 fhp


db Match 


> 

°<, 

CO 
rvl 

> 
CL 
CO 




sp:YGB7_ALCEU 


gp:HIU68399_3 


gp:HIU68399_3 


pir:A69778 






sp:DAPE_ECOLI 








GPU:DCA297422_1


_j 
O 
0 

LU 

*' 

— 1 
< 

5 

CL 

CO 




gp:AF036485_6 


< 

X 
CD 
> 

I 

cl- 
ot 

Li_ 

CL 
€rt 


< 

LU 

or 
0 

x' 

3 

CL 
CO 


CO 

o' 

CN 
LU 
O 
CO 
CL 

cn 


sp:3MG1_ECOLI 


3 

UJ 

O 
_i 
< 

<■ 

CL 

X 
cu 
in 


ORF
(bp)


285


564


1002


693


714


1209


822


687


1323


1905


774


762


954


1068


642


618


816


903


975


588


1158


Terminal 
(nt) 


3011273 


3011242 


3011808 


3013106 


3013837 



3015824 


3014648 


3016924 


3015827 


3019220 


3018312 


3017420 


3018123 


3019542 


3020561 


3021208 


3022113 


3022998 


3025353 


3026139 


3026142 


Initial 
(nt) 


3010989 


3011805 


3012809 


3013798


3014550


3014616 


3015469 


3016238 


3017149 


3017316 


3017539 


3018181 


3019076 


3020609 


3021202 


3021825 


3022928 


3023900 


3024379 


3025552 


3027299


SEQ
NO.
(a.a.)


6605 


6606


6607 


6608 


6609 


6610 


6611 


6612


6613 


6614 


6615 
6616

6617 


6618 


6619 


6620 


6621 
6622 


6623 


6624 


6625


SEQ 
NO. 
(DNA) 


3105 


3106 


3107 


3108 
3109 


3110 


3111


3112


3113 


3114 


3115 
3116 

3117 


3118 


3119 


3120 


3121


3122


3123
Function




oxidoreductase




transcription antiterminator or beta-
glucoside positive regulatory protein




6-phospho-beta-glucosidase




6-phospho-beta-glucosidase


aspartate aminotransferase




transposase (ISCg2)


hypothetical membrane protein




UDP-glucose dehydrogenase


deoxycytidine triphosphate
deaminase




hypothetical protein




beta-N-Acetylglucosaminidase


Matched 
length 
(a.a) 




o 

CN 




CN 
CO 




CO 




CO 

to 


CN 
O 




o 


CO 
CO 
CO 




CN 


CO 
CO 




cn 

CN 
(X 




o 














































Similarity
(%)




63.8 




69.3 




59.9 




78.8 


80.9 




100.0 


70.2 




72.2 


72.3 




■<r 

CD 

m 




58.1 




Identity
(%)




co 
ro 




28.1 




43.7 




O) 
CO 


r- 

~co 
to 




100.0 


33.6




40.5


43.5 




30.6 




28.5 


Table 1 (continued)


Homologous gene




Streptomyces coelicolor A3(2) 
mmyQ 





Escherichia coli K12 bglC




Clostridium longisporum B6405 
abgA




Clostridium longisporum B6405 
abgA 


Methylobacillus flagellatus aat 




Corynebacterium glutamicum 
ATCC 13032 tnp 


Streptomyces coelicolor A3(2) 
SCQ11.10C 




Sinorhizobium meliloti rkpK


Escherichia coli K12 dcd 




Streptomyces coelicolor A3(2) 
SCC75A.16C 




Streptomyces thermoviolaceus
nagA 




db Match 




gp:SCO276673_18




_j 
O 
o 

LU 

1 

o 

_J 
O 

CD 
Ol 
i/y 




sp:ABGA_CLOLO 




sp:ABGA_CLOLO 


CN 

«' 

to 

CD 
CO 

_i 
Cl 
CO 




CD 
CO 

u_ 
< 

CL 
CO 


gp:SCQ11_10




m 

CO 
ro 

CN 
CN 
■V 
CN 

Cl 


sp:DCD_ECOLI 




to 

<■ 

LO 

u 
o 

V) 
CL 

cn 




gp:AB008771_1 




ORF 

l D PJ 


ro 
o 

CD 


T 

csj 

CO 


to 
m 


OJ 

m 


o> 

CN 


o 

CD 
CO 


CO 
CO 


o 

CN 


1257


o 
o 

CO 


1203 


1257 


CO 
CO 


1317 


r- 
to 
in 


ro 
CN 


r- 
r- 


1689 


1185 


45 


Terminal 
(nt) 


3028163 


3028891 


3029033 


3028884 


3029782 


3029702 


3030535 


3030101 


3031979 


3032348 


3033863 


3035437 


3034105 


3035440 


3036845 


3037911 


3038942 


3038993 


3040748 


50 


Initial 
(nt) 


3027561 


3028268


3028878 


3029474 


3029504 


3030061 


3030155 


3030340 


3030723 


3032647 


3032661 


3034181 


3034287 


3036756 


3037411 


3037675 


3038172 


3040681 


3041932 




SEQ 

NO. 
(a.a.) 


6626 


6627 


6628


6629 


6630 


6631 


6632 


6633 


6634 


6635 


6636


6637 


6638 


6639 


o 

T 

to 

CO 


6641 


CN 

to 
to 


6643 


T 

to 
to 


55 


SEQ 
NO. 
(DNA) 


3126 


3127 


3128 


3129 


3130 


3131 


3132


3133 


3134 


3135 


to 

CO 

ro 


3137 


3138 


3139


3140 


3141 


3142 


ro 

T— 

ro 


•c 
ro 
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10 






Function 






hypothetical protein 






hypothetical membrane protein 


acyltransferase or macrolide 3-O-
acyltransferase




hypothetical membrane protein 




hexosyltransferase 


methyl transferase 


phosphoenolpyruvate carboxykinase 
(GTP) 


C4-dicarboxylate transporter 


hypothetical protein 


hypothetical protein 


membrane transport protein




Matched 
length 
(aa) 






1416 






CO 
CO 
CO 


CO 

o 

-<r 




CO 
CM 

in 




cn 

CD 
CO 


in 

CM 


o 

CO 


CM 
CO 
CO 


CM 


o 

CM 


CO 

CO 

r-. 




Similarity 






49.4 






t-~ 

T 


51.0 




54.8 




79.1 


73.3 


78.5 


52.7 


67.2 


85.0 


72.3 




-C-S" 

■D 






26.6






24.8 


III 




3112 




CO 

in 


5P 

CO 

m 


54.7 








42.3 




CM 


iri 

CO 


cn 

CO 


Homologous gene 






Mycobacterium leprae 
MLCB1883.13C 






Mycobacterium leprae 
MLCB1883.05c


Streptomyces sp. acyA 




Mycobacterium leprae 
MLCB1883.04.: 




Mycobacterium tuberculosis 
H37Rv Rv0225 


Mycobacterium tuberculosis 
H37Rv Rv0224c 


Neocallimastix frontalis pepck 


Pyrococcus abyssi Orsay 
PAB2393 


Escherichia coli K12 yggH 


Mycobacterium tuberculosis 
H37Rv Rv0207c 


Mycobacterium tuberculosis 
H37Rv Rv0206c mmpL3 




db Match 






gp:MLCB1883_7 






gp:MLCB1883_4 


pir:JC4001 




gp:MLCB1883_3 




pir:G70961


pir:F70961 


sp:PPCK_NEOFR 


pir:E75125 


_i 
O 
O 

LU 

I 

X 
CD 
CD 
> 

CL 
V) 


pir:E70959


pir:C70839




si 




o 

CM 


3129 


CM 
CO 


in 
o> 


CO 

o 

cn 


1068 


co 
o 


CM 
CM 


cn 

CO 
CO 


1137 


h~ 

r- 


1830 


1011 


in 

CO 

r- 


in 
o 
r-- 


2316 


1422 


Terminal 
(nt) 


3042437 


3042703 


3045788 


3043022 


3045990 


3048048 


3046122 


3047197 


CD 

r-- 
■*r 
co 

O 

CO 


3051190 


3049456 


3051964 


3052062 


3055769 


3056631 


3057317 


3059643 


3058096 


Initial 
(nt) 


3041994 


3042503 


3042660 


3043642 


3045796 
3047146 


3047189 


3047904 


3048058 


3050522 


3050592 


3051194 


3053891 


3054759 


3055867 


3056613 


3057328 


3059517 


SEQ 

NO. 
(a.a.)


6645 


CD 

•^r 

CD 
CO 


6647 


6648 


6649 


6650 
6651 


6652 


6653 


6654 


6655 


6656 


6657 


6658 


6659 


6660


6661 


6662 


SEQ 
NO. 
(DNA) 

3145 


3146 


3147 


3148 


cn 

CO 


3150 


3151 


3152 


3153


m 

CO 


3155 


3156 


3157 


3158 


3159 


3160 


3161 


cn' 
CO I 

co! 
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5 
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Function 


hypothetical membrane protein 


hypothetical membrane protein 


propionyl-CoA carboxylase complex
B subunit


polyketide synthase 


acyl-CoA synthase 


hypothetical protein 




major secreted protein PS1 protein 
precursor 






antigen 85-C 


hypothetical membrane protein 


nodulation protein


hypothetical protein 


hypothetical protein 




phosphatidic acid phosphatase 


15 




Matched 
length 
(aa) 


■<r 

CO 
CO 


CO 

o 


CO 
CN 
LO 


1747 


CN 

cn 

LO 


CO 

co 




h- 

LO 
CO 






CO 
CO 


r-» 
co 
CO 


in 

CO 
CN 


CO 
CO 


to 

CO 
CO 




O 










































20 




Similarity


62.9 


69.4 


76.9 


CM 
XT 

to 


62.3 


67.4 




99.5 






62.5 


61.2 


51.5 


75.0 


74.7 




56.5 






Identity 


29.1 


34.3 


cri 

T 


CN 

o 

CO 


LO 
CO 
CO 


CO 
CD 
CO 




98.6






36.3


37.5 


27.1 


51.2 


55.6 




28.2 


25 
30 
35 


Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0204c 


Mycobacterium tuberculosis 
H37Rv Rv0401 


Streptomyces coelicolor A3(2)
pccB


Streptomyces erythraeus eryA 


Mycobacterium bovis BCG


Mycobacterium tuberculosis 
H37Rv Rv3802c 




Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1






Mycobacterium tuberculosis
ERDMANN RV0129C fbpC


Mycobacterium tuberculosis 
H37Rv Rv3805c 


Azorhizobium caulinodans
ORS571 noeC 


Mycobacterium tuberculosis
H37Rv Rv3807c 


Mycobacterium tuberculosis 
H37Rv Rv3808c




Bacillus licheniformis ATCC 
9945A bcrC 


40 




db Match 


pir:A70839 


pir:H70633 


gp:AF113605_1


sp:ERY1_SACER 


prf:2310345A 


pir:F70887 




sp:CSP1_CORGL






\- 
O 
>- 

2 , 

o 

CO 

CO 

< 
cL 

</l 


pir:A70888 


sp:NOEC_AZOCA


pir:C70888 


pir:D70888 




sp:BCRC_BACLI 








1083 


CO 
CO 
CO 


1548 


4830 


1788 


r-. 

CN 

cn 


co 

CO 


1971 


1401 


CD 
CN 


1023 


2058 


to 

CO 
CO 


o 

CO 


1968 


1494 


r»- 
r*- 


45 




Terminal 
(nt) 


3060733 


3061095 


3061380 


3062951 


3068143 


3070214 


3071147 


3071650 


T 

in 
r- 
o 

CO 


3073857 


3075540 


3076715 


3078853 


3079848 


3080344 


3083960 


3083935 


50 




Initial 
(nt) 


3059651 


3060733 


3062927 


3067780 


3069930 


3071140 


3071644 


3073620 


r- 

V 

o 

o 

CO 


3074075 


3076562 


3078772 


3079848 


3080351 


3082311 


3082467 


3084411 






SEQ 

NO. 
(a.a.) 


6663 


6664 


6665 


6666


6667 


6668 


6669 


6670 


6671 


6672


6673 


6674 


6675 


6676 


6677 


6678 


6679 


55 




SEQ 
NO 
(DNA) 


3163


3164 


3165


3166


s 

L" 


3168 


3169 


3170 


3171 


[CN 
| CO 


3173 




3174 


3175 


3176 


3177 


3178 


3179 
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Function 






dimethylaniline monooxygenase (N- 
oxide-forming) 




UDP-galactopyranose mutase 


hypothetical protein 


glycerol kinase 


hypothetical protein 


acyltransferase


seryl-tRNA synthetase 


transcriptional regulator, GntR family 
or fatty acyl-responsive regulator 


hypothetical protein 


hypothetical protein 




2,3-PDG dependent 
phosphoglycerate mutase 




nicotinamidase or pyrazinamidase 




15 


Matched 
length 
(a.a) 






ro 




r-» 
r- 
co 


cn 
u-> 

CO 


cn 
cn 


cn 

CM 


CD 
CN 


CO 


CO 

to 

CM 


CD 
tD 
to 


CO 




CO 

CN 




o 

CO 












































20 


Similarity






50.4 




cn 
CM 


CO 

r-' 


CO 
CO 


CO 

o 


O 
CM 


to 

CO 


r- 

to 


61.2 


r-. 
cri 
r- 




62.8 




50.9 






Identity
(%) 






24.4




CM 
CO 

■c- 


to 

oS 

CM 


51.7 


CD 


46.7


70.2 


27.7 


32.5


_o 

CD 
T 




CM 
CO 




CM 




Co CO M 

Table 1 (continued) 


Homologous gene






Sus scrofa fmo1




Escherichia coli K12 glf


Mycobacterium tuberculosis 
H37Rv Rv3811 csp


Pseudomonas aeruginosa 
ATCC 15692 glpK 


Mycobacterium tuberculosis 
H37Rv Rv3813c


Mycobacterium tuberculosis
H37Rv Rv3816c


Mycobacterium tuberculosis 
H37Rv


Escherichia coli K12 farR 


Mycobacterium tuberculosis 
H37Rv Rv3835 


Mycobacterium tuberculosis
H37Rv Rv3836




Amycolatopsis methanolica pgm 




Mycobacterium smegmatis pzaA 




40 


db Match






sp:FMO1_PIG




sp:GLF_ECOLI 


pir:G70520 


LU 
< 
UJ 
CO 

a 

i 

CL 
_l 

O 
a. 
(/» 


pir:A70521


pir:D70521 


uo 
CO 
■*r 

CD 
CM 

§ 

CL 

1SI 

cn 


_i 

O 

CJ 
LU 

1 

or 
< 

CL 
to 


pir:H70652 


pir:A70653 




gp:AMU73808_1 




prf:2501285A 






UL — 

g£ 


r- 
r- 


o 


1302 


CM 
CO 


1203 


2049 


1527 


TT 
CO 
CO 


CO 

co 


1266 




1113 


CN 
CO 


CO 

CO 


CO 
CO 
CO 


o 

CO 
CO 


1143 


CO 
CN 

r— 


45 


Terminal 
(nt)


T 

CM 

CD 

o 
n 


3085218 


3087048 


3088276 


3087101 


3090664 


3090760 


3092342 


3093175 


3094078 


3096287 


3097423 


3097764 


3097780 


3097904 


3099454 


3100698 


3101426 


50 


Initial 
(nt)


3085200 


3085727 


3085747 


3087665 


3088303 


3088616 


3092286 


3093175 


3094050 


3095343 


3095574 


3096311 


3097423 


3097878 


3098572 


3098825 


3099556 


3100698 




2§ « 

co 2 5 


6680


6681 


6682 


6683 


6684 


6685 


6686 


6687 


6688 


6689 


6690 


6691 


6692 


6693 


6694 


6695 


6696


6697 


55 


Set 

| oo ^ g. 


3180


3181 


3182 


3183 


3184


3185 


3186 


3187 


3188 


3189 


3190 


3191


3192 


3193


3194 


3195 


3196 


3197 
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Function 


transcriptional regulator 








hypothetical protein 


glucan 1,4-alpha-glucosidase 




glycerophosphoryl diester 
phosphodiesterase 


gluconate permease 






pyruvate kinase 


L-lactate dehydrogenase 


hypothetical protein 


hydrolase or haloacid 
dehalogenase-like hydrolase 


efflux protein 


transcription activator or 
transcriptional regulator GntR family 


phosphoesterase 


shikimate transport protein 


15 




Matched 
length 
(aa) 


O 
CO 
CO 








o 


CM 
CO 




CO 
LO 
CM 


CO 
LO 






cn 


CO 


CO 
CM 
LO 


CM 
CM 


CO 

CO 


CM 
CM 


LO 
IO 
CM 


CM 
CM 

■*r 


20 




Similarity
(%) 


57.1 








81.3 


55.3 




54.1 


71.9






47.7 


99.7 


64.8 


58.5 


67.6 


57.0 


68.6 


74.4 






Identity 


CD 

ro 








cn 

CO 


CO 
CM 




_Q_ _ 
CO 
CM 


_co 

CO 






25.5 


cn 
cn 


33.5 


32.1 


39.9


27.6 


47.8 


CO 


25 
30 
35 


Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2) 
SC6G4.33 








Streptomyces lavendulae 
ORF372 


Saccharomyces cerevisiae 
S288C YIR019C sta1




Bacillus subtilis glpQ 


Bacillus subtilis gntP 






Corynebacterium glutamicum 
AS019 pyk


Brevibacterium flavum lctA


Mycobacterium tuberculosis 
H37Rv Rv1069c


Streptomyces coelicolor A3(2)
SC1C2.30 


Brevibacterium linens ORF1 
tmpA 


Escherichia coli K12 MG1655
glcC 


Mycobacterium tuberculosis 
H37Rv Rv2795c


Escherichia coli K12 shiA 


40 




db Match 


co 

CO 

I 

o 

CO 

O 
to 

Ol 

cn 








CM 

r*» 
co 

CO 
CM 

CD 
a. 


sp:AMYH_YEAST




CO 

o 
< 

CO 

1 

O 
a 
_j 
O 

Cl 
<n 


sp:GNTP_BACSU






sp:KPYK_CORGL 


gsp:Y25997 


pir:C70893 


gp:SC1C2_30 


j 

CO 

S 

CO 

o 

< 

cn 


i 

O 
o 

Ol 

i 

o 
o 
_l 

p. 

CL 

to 


pir:B70885 


sp:SHIA_ECOLI 






81 


1035 


o 

CM 


CM 
LO 
LO 


o 
co 


r*- 

CM 
CO 


1314 


CO 

cn 


cn 
co 


1389 


CM 
T 
CO 


cn 

LO 


1617


CM 
T 
CO 


1776 


CO 
CO 
CO 


co 
to 


CO 

cn 

CO 


CO 
CO 

r-. 


1299 


45 




Terminal 
(nt) 


3102768 


3101744 


3102079 


3103763 


3104252 


3105719 


3106053 


3106951 


3109519 


3108823 


3110003 


3110464 


3112449 


3115394 


3116042 


3116621 


3117332 


3118121 


3119582 


50 




Initial 
(nt) 


3101734 


3101863 


3102630 


3102894 


3103926 


3104406 


3106970 


3107769 


3108131 


3109464 


3109845 


3112080 


3113390 


3113619 


3115407 


3116079 


3116640 


3117336 


3118284 






So 


6698 


6699 


6700 


6701 


6702 


6703 


6704 


6705 


6706 


6707 


6708 


6709 


o 

r— 
CO 


6711 


6712 


6713 


6714 


6715 


6716 


55 




SEQ 
NO. 
(DNA) 


3198 


3199


3200


3201 


3202 


3203 


3204 


3205 


3206 


3207 


3208 


3209 


3210 


3211 


3212 


3213 


3214 


•* 

3215 


CO 

CM 
CO 
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Function 


L-lactate dehydrogenase or FMN- 
dependent dehydrogenase 




immunity repressor protein 






phosphatase or reverse 
transcriptase (RNA-dependent) 




peptidase or IAA-amino acid
hydrolase 




peptide methionine sulfoxide 
reductase 


superoxide dismutase (Fe/Mn) 


transcriptional regulator 


multidrug resistance transporter 








hypothetical protein 


membrane transport protein 


transcriptional regulator 


two-component system response 
regulator 


15 


Matched 
length
(a.a)


to 

co 




in 
m 






CD 

to 
m 




CN 
CN 




o 

CN 


CO 


CN 

cn 

CN 


00 
CO 








CO 
CN 




r-. 

CO 


CN 
CN 


20 


I 

Similarity 


68.9




80.0 






51.3 




63.1 




69.1 


r- 

CN 
CO 


65.8 


49.0 








64.8 


59.3 


65.0 


75.5 




Identity
(%) 






m 






29.5 




36.9 




CD 
T 


82.3 


m 

CN 
CO 


2314 








33.8


CO 

r-" 

CN 


CN 

r«-' 
CO 


CO 

o 
m 


25 

T) 
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ZJ 

c 

o 

30 

0J 
JO 

35 
40 


Homologous gene 


Neisseria meningitidis lldA




Bacillus phage phi-105 ORF1






Caenorhabditis elegans 
Y51B11A.1 




Arabidopsis thaliana i I1 1 : 




Escherichia coli B msrA 


Corynebacterium 
pseudodiphtheriticum sod 


Bacillus subtilis gltC 


Corynebacterium glutamicum 
tetA 








Mycobacterium tuberculosis 
H37Rv Rv3850 


Streptomyces cyanogenus lanJ 


Bacillus subtilis 168 yxaD


Corynebacterium diphtheriae 
chrA 
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sp:RPC_BPPH1 
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21992 
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t 

CO 

NT 

r- 

CN 
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33030 


31508 
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CO 


ro 


CO 


CO 


CO 


ro 


CO 


CO 


CO 


CO 


ro 


CO 


CO 


CO 


CO 


CO 


CO 


ro 


ro 


CO 


50 


Initial 
(nt) 


1196651 


120909 


121598 


22129 


23222 


24172 


24886 


25298


25343 


m 

CO 
CN 


26392 


28417 


28606 


29785 


32920 


33028 


33115 


35268 


35297


36491 
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CO 


CO 


ro 


ro 


CO 
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ro 
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CO 
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NO. 
(a.a.) 


6717 


6718 


6719 
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6721 


6722 


6723


6724 
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6731 
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r 
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3236 
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Function 






two-component system sensor 
histidine kinase 


hypothetical protein 


hypothetical protein 


stage III sporulation protein 


transcriptional repressor 


transglycosylase-associated protein 


hypothetical protein 


hypothetical protein 


RNA pseudouridylate synthase 


hypothetical protein 


hypothetical protein 




bacterial regulatory protein, gntR 
family or glc operon transcriptional 
activator 


hypothetical protein 


hypothetical protein 


Matched 
length 

(a.a) 






CO 

o 


CO 
T 
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in 
to 

CM 


CM 
CD 


CO 


CD 
CO 
CM 


CO 


rr 

CO 
ro 


CO 


CM 




O 

o 


CO 

CO 


r- 

CD 
CM 
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64.5 


79.2 


59.2 


53.6 


60.9 


71.3 


69.6 
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66.0 
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CO 
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o 

CO 


CO 
CM 


CO 


Homologous gene 






Corynebacterium diphtheriae
chrS 


Streptomyces coelicolor A3(2)
SCH69.22c 


Streptomyces coelicolor A3(2) 
SCH69.20c 


Bacillus subtilis spoIIIJ


Mycobacterium tuberculosis 
H37Rv Rv3173c 


Escherichia coli K12 MG1655 
tagi | 


Mycobacterium tuberculosis 
H37Rv Rv2005c 


Escherichia coli K12 MG1655
yhbW 


Chlorobium vibrioforme ybc5 


Chlamydia pneumoniae 


Chlamydia muridarum Nigg 
TC0129 




Escherichia coli K12 MG1655 
glcC 


Streptomyces coelicolor 
SC4G6.31c 


Mycobacterium tuberculosis
H37Rv Rv2744c 
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1 
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tn 

CM 
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3137884 
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3139651 


3141523 


3141969 


3143356 


3144482 
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Function 












methyltransferase 


nodulin 21-related protein 








transposon tn501 resolvase 




ferredoxin precursor 


hypothetical protein 


transposase 


transposase protein fragment 
TnpNC 




glyceraldehyde-3-phosphate 
dehydrogenase (pseudogene) 


lipoprotein 


copper/potassium-transporting 
ATPase B or cation transporting 
ATPase (E1-E2 family) 
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Saccharopolyspora erythraea fer 


Streptomyces coelicolor A3(2) 


Corynebacterium glutamicum 
Tnp1673 


Corynebacterium glutamicum 




Pyrococcus woesei gap 


Synechocystis sp. PCC6803 
sll0788


Archaeoglobus fulgidus AF0152 
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Function 


transposase (IS1628) 


thioredoxin 




transmembrane transport protein or 
4-hydroxybenzoate transporter 




hypothetical protein 


replicative DNA helicase 




50S ribosomal protein L9


single-strand DNA binding protein 


30S ribosomal protein S6 




hypothetical protein 




penicillin-binding protein 


hypothetical protein 


bacterial regulatory protein, marR 
family 


hypothetical protein 




hypothetical protein 


hypothetical protein 


ABC transporter ATP-binding protein 
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length 
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62.5 
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72.0 


65.0 


61.8 
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Table 1 (continued) 


Homologous gene 


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB


Escherichia coli K12 thi2




Pseudomonas putida pcaK 




Escherichia coli K12 yqjl 


Escherichia coli K12 dnaB




Escherichia coli K12 RL9


Escherichia coli K12 ssb


Escherichia coli K12 RS6 




Mycobacterium smegmatis 
mc(2)155 




Bacillus subtilis ponA 


Mycobacterium tuberculosis 
H37Rv Rv0049 


Mycobacterium tuberculosis 
H37Rv Rv0042c 


Mycobacterium tuberculosis 
H37Rv Rv2319c yofF 




Bacillus subtilis yhgC 


Escherichia coli K12 yceA


Escherichia coli K12 ybjZ 
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3187042 
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3193252 


50 




Initial
(nt) 


31776B3 


3178558 


3178609 


CO 
TT 
O 
CO 

r- 
co 


3161104 


3181126 


3182866 


3183469 


3183927 


3184661 


3184985 


3185536 


3186993 


3187912 


3189201 
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NO. 
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Function 


ABC transporter ATP-binding protein


hypothetical protein 


hypothetical protein 




DNA protection during starvation 
protein 


formamidopyrimidine-DNA
glycosylase


hypothetical protein 






methylated-DNA--protein-cysteine 
S-methyltransferase 


zinc-binding dehydrogenase or 
quinone oxidoreductase 
(NADPH:quinone reductase) or
alginate lyase 




membrane transport protein 


malate oxidoreductase [NAD] (malic 
enzyme) 


gluconokinase or gluconate kinase


teicoplanin resistance protein 


teicoplanin resistance protein 
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Homologous gene 


Escherichia coli K12 MG1655
ybjZ


Campylobacter jejuni Cj0606 


Mycobacterium tuberculosis
H37Rv Rv0046c 




Escherichia coli K12 dps 


Escherichia coli K12 mutM or 

fpg 


Escherichia coli K12 rtcB






Homo sapiens mgmT 


Cavia porcellus (Guinea pig) qor




Mycobacterium tuberculosis 
H37Rv Rv0191 ydeA


Corynebacterium melassecola 
(Corynebacterium glutamicum) 
ATCC 17965 malE 


Bacillus subtilis gntK


Enterococcus faecium vanZ


Enterococcus faecium vanZ 
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Function 


maleylacetate reductase


sugar transporter or D-xylose-proton
symporter (D-xylose transporter) 


bacterial transcriptional regulator or 
acetate operon repressor 


oxidoreductase


diagnostic fragment protein 
sequence 


myo-inositol 2-dehydrogenase 


dehydrogenase or myo-inositol 2- 
dehydrogenase or streptomycin 
biosynthesis protein 


phosphoesterase 








stomatin 




DEAD box RNA helicase family 


hypothetical membrane protein 




phosphomethylpyrimidine kinase 


mercuric ion-binding protein or 
heavy-metal-associated domain 
containing protein 


ectoine/proline uptake protein 


Matched 
length 
(aa) 


r- 
m 
CO 


CO 
LO 


O 
CO 
CM 


r-» 
to 

CO 


o 

CN 


CN 
CO 

cO 


CO 
CO 


1242 








CO 

o 

CM 




1660 


t— 




in 

CM 


CD 


r- 
cn 

CN 


Similarity 


75.5 


58.3 


60.7 


55.7


58.2 


59.6 


62.4 


r— 

CN 
CO 








57.3 




80.2 


61.0 




76.8 


70.1 


62.3 


Identity 
(%)


43.0 


31.4 


LO . 
CN 


CN 

CN 


25.9 


tn 

CN 


CO 


33.3 








CO 
CN 




«j 
m 


34.8




*T 

O 

in 


ro 
CD 

V 


CD 
O) 
CM 


Homologous gene 


LO 
CL 

d 
(/» 

U) 
CO 

c 
o 
E 
o 
x> 

=3 

<u 

i/> 
CL 


Escherichia coli K12 xylE 


Salmonella typhimurium iclR


Escherichia coli K12 ydgJ


Listeria innocua strain 4450


Sinorhizobium meliloti idhA 


Streptomyces griseus strI


Bacillus subtilis yvnB 








Caenorhabditis elegans unc1




Mycobacterium bovis BCG 
RvD1-Rv2024c 


Mycobacterium leprae u2266k 




Bacillus subtilis thiD 


Bacillus subtilis yvgY 


Corynebacterium glutamicum 
proP 


db Match 


a 

CO 
LU 
CO 
0_ 

I 

LL 

£0 
O 

h- 

a. 
to 


8 

UJ 

I 

UJ 

_J 

> 

X 
ex. 

tn 


sp:ICLR_SALTY 


sp:YDGJ_ECOLI 


gsp:W61761 


3 
CO 

o 
< 

o 

CN 

5 

Cl 
»/) 


sp:STRI_STRGR


o 
o 
r«- 

o 

a. 








_j 

UJ 
LU 
< 

o 

u 
2: 

Ol 

CO 




m 
o 
to 

CO 

o 

CD 

5 

CL 

cn 


prf:2323363AAM 




CO 

3 

X 

t- 

Ql 

m 


T 
O 
O 

LL 
Cl 


< 
in 
cn 

CN 
t— 
O 

in 

CN 
tf 
CL 


LL — 

§s 


1089 


1524 


CO 
CO 


1077 


O) 

co 


1005


1083


4032


CO 


co 
CO 


1086 


r-- 


CO 
CD 


4929 


r- 
o 
m 


o 
to 

CO 


o 
O 
to 


CO 
T 
CN 


CO 
CO 


Terminal 
(nt) 


3257403


3258561 


3261989 


3263221 


3264115 


3265146 


3266266 

i 


3271093 


3267913 


3268618 


3272477 


3274488 


3275602 


3276671 


3281666 


3283101 


3282347 


3283383 


3283473 


Initial 
(nt) 


3258491 


3260084 


3261129 


3262145 


3263237 


3264142 


3265184 


3267062 


3268557 


3269235 


3271392 


3275231 


3276570 
3281599 


3282172 


3282742 


3282946 


3283141 


3284309 




6881 


6882 


6883 


6884 


6885 


6886 


6887 


6888 


6889 


6890 


6891 


6892 


6893 


6894 


in 

CD 

CO 

to 


6896 


6897


6898 


6899 


SEQ 
NO. 
(DNA)


3381 


3382 


3383 


3384 


3385 


3386


3387 


3388 


3389 


3390 


3391 


3392


co rr 
cn cn 
co | ro 

CO j CO 


3395 


3396 


3397 


3398 


3399 
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Function 




thioredoxin ch2, M-type 


N-acetylmuramoyl-L-alanine 
amidase 






hypothetical protein 


hypothetical protein 


partitioning or sporulation protein 


glucose inhibited division protein B 


hypothetical membrane protein 


ribonuclease P protein component 


50S ribosomal protein L34






L-aspartate-alpha-decarboxylase 
precursor 


2-isopropylmalate synthase 


hypothetical protein 


aspartate-semialdehyde 
dehydrogenase 


3-dehydroquinase 


Matched 
length 
(aa) 




cn 


CO 

cn 

v— 






CN 
CN 


r-- 
CO 

co 


CN 
CN 


ro 
in 


ro 

CO 


CO 
CN 








CO 
CO 


CO 
CO 


in 

CO 


CO 


Ol 
T 










































Similarity
(%) 




76.5 


75.4 






58.5 


60.5 


78.0 


64.7 


75.4 


59.4 


93.6 






100.0 


100.0 


100.0 


100.0 


100.0 


Identity
(%)




42.0 


51.0 






34.4 


37.6 


65.0


36.0 


r- 

- ■ 


CO 

CO 
_r\t _ 


o 

CO 






100.0 


100.0 


100.0 


100.0 


100.0 


Homologous gene 




Chlamydomonas reinhardtii thi2 


Bacillus subtilis cwlB 






Mycobacterium tuberculosis 
H37Rv Rv3916c 


CN 
CD 
>s 

ra 

;cj 

3 

CL 
CO 

ro 
c 
o 
E 
o 

<U 

a. 


Mycobacterium tuberculosis 
H37Rv parB 


Escherichia coli K12 gidB


Mycobacterium tuberculosis 
H37Rv Rv3921c 


Bacillus subtilis rnpA


Mycobacterium avium rpmH 






Corynebacterium glutamicum 
panD 


Corynebacterium glutamicum 
ATCC 13032 leuA


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
13032 orfX 


Corynebacterium glutamicum 
asd 


Corynebacterium glutamicum 
AS019 aroD 


db Match 




sp:THI2_CHLRE 


sp:CWLB_BACSU 






pir:D70851


sp:YGI2_PSEPU 


ID 
Q_ 
LU 

in 

CL 

1 

> 

CL 
VI 


sp:GIDB_ECOLI 


pir:A70852


sp:RNPA_BACSU 


i 

m 
co 

CO 

5 
< 

Cl 

cn 






gp:AF116184_1


sp:LEU1_CORGL 


sp:YLEU_CORGL 


sp:DHAS_CORGL 


J 

in 

T 
CN 

LL 
< 

Cl 
CO 




1185 


CN 
CO 


1242 




1041 


CO 
CO 


1152 


r-- 
CO 
CO 


Ol 
CO 
CO 


in 
cn 


Ol 
O) 
CO 


CO 
CO 

ro 


T 

cn 

CN 


CN 
CN 
CN 


CO 

o 


1848 


in 
in 

CM 


1032 


r- 


Terminal 
(nt) 


3300119 


3301729 


3302996 


3301989 


3304475 


3302999 


3303636 


3304835 


3305864 


3306682 


3307971 


3308412 


3309321 


3308822 


147573 


266154 


268814 


271691 


CN 

in 
to 


Initial 
(nt) 


3301303 


3301358 


3301755 


3302765 


3303435 


3303616


3304787 


3305671 


3306532


3307632 


3308369 


3308747 


3309028 


3309043 


147980 


268001 


269068 


270660 


446075 


SEQ 
NO. 
(a.a.) 


CO 

cn 
to 


6919


6920 


6921 


6922 


6923 


6924 


6925


6926 


6927 


6928 


6929 


6930 


6931 


6932 


6933 


6934 


6935 


6936 


SEQ 
NO. 
(DNA) 


3418 


3419 


3420 


3421 


3422 


3423 


3424 


3425


3426 


r- 

CN 
CO 


3428 


3429 


3430 


3431 


3432 


3433 


3434 


3435 


3436 
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3 







subuit 




c 




putative binding protein or peptidyl- 
prolyl cis-trans isomerase 




hypothetical membrane protein 




aromatic amino acid permease 










Function 


elongation factor Tu 


preprotein translocase secY 


isocitrate dehydrogenase
(oxalosuccinate decarboxylase)


acyl-CoA carboxylase or biotin-
binding protein


citrate synthase 


glycine betaine transporter 


L-lysine permease 


hypothetical protein 


succinyl diaminopimelate 
desuccinylase 


proline transport system 


arginyl-tRNA synthetase


Matched 
length 
(aa) 


CD 

cn 
ro 


o 

TT 


CO 

ro 

r*- 


o> 

CO 


CO 


CO 


LO 

Oi 

LO 


CO 
CN 
■*T 


o 

LO 


CO 
CO 


CO 
CO 


cn 

CO 
CO 


CN 
LO 


o 
in 
m 
































Similarity
(%) 


100.0 


O 

d 
o 


O 
d 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0


100.0 


100.0 


100.0 


100.0 


100.0 


Identity 
(%) 


100.0 


100.0 


100.0


o 

d 
o 


100.0 


100.0 


100.0 


100.0 


100.0


100.0 


100.0 


100.0 


100.0 


100.0






ro 


























Homologous gene 


Corynebacterium glutamicum 
ATCC 13059 tuf 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ23 
secY 


Corynebacterium glutamicum 
ATCC 13032 icd 


Corynebacterium glutamicum 
ATCC 13032 accBC 


Corynebacterium glutamicum 
ATCC 13032 gltA


Corynebacterium glutamicum 
ATCC 13032 fkbA 


Corynebacterium glutamicum 
ATCC 13032 betP 


Corynebacterium glutamicum 
ATCC 13032 orf2 


Corynebacterium glutamicum 
ATCC 13032 lysI


Corynebacterium glutamicum 
ATCC 13032 aroP 


Corynebacterium glutamicum 
ATCC 13032 orf3 


Corynebacterium glutamicum 
ATCC 13032 dapE 


Corynebacterium glutamicum
ATCC 13032 putP 


Corynebacterium glutamicum
AS019 ATCC 13059 argS 


db Match 


sp:EFTU_CORGL 


sp:SECY_CORGL 


sp:IDH_CORGL 


prf:2223173A 


sp:CISY_CORGL 


sp:FKBP_CORGL 


sp:BETP_CORGL 


sp:YLI2_CORGL 


sp:LYSI_CORGL 


_i 

O 
oc 
O 
o 

a» 
O 
oc 
< 

Cl 
in 


CO 

in 

CN 

in 
CO 

L_* 
Ol 


prf:2106301A 


gp:CGPUTP_1 


sp:SYR_CORGL 




1188 


1320 


2214 


1773 


1311 


LiO 
CO 


1785


1278 


1503 


1389 


co 
cn 


1107 


1572 


1650 


Terminal 
(nt) 


527563 


570771 


677831 


718580 


879148 


879629 


946780 


1029006 


1030369 


1153295 


1154729 


1156837 


1218031 


1239923 


Initial 
(nt) 


526376 


569452 


680044 


720352 


877838 


879276 


944996 


1030283 


1031871 


1154683 


1155676 


1155731 


1219602 


1238274 




6937 



6938 


6939 


6940 


6941 


6942 


6943 


6944 


m 
cn 

CO 


6946 


6947 


6948 


cn 

CD 
CD 


6950 


SEQ 
NO. 
(DNA) 


3437 


3438 


3439 


o 
ro 


CO 


CN 
T 

CO 


ro 

TT 

CO 


3444


m 
co 


3446 


*T 

CO 




cn 

CO 


o 
in 

ro 
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E c 
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cj 

in 1 

CO 

>■ 



< 

cn 



oo 

CN 



0 

or 
O 
u 

m' 

CD 

< 



a ^ < 

LU O 2 
W 2 Q 






EP 1 108 790 A2 



8 



Function 


NADH dehydrogenase


phosphoribosyl-ATP- 
pyrophosphohydrolase 


l/l 
ro 

s? 

o 
-O 

CO 

o 

CD 

■o 

o 
o 
>* 
o 

Ol 

c 

c 
o 


ammonium uptake protein, high 
affinity 


protein-export membrane protein 
secG 


phosphoenolpyruvate carboxylase 


chorismate synthase (5-
enolpyruvylshikimate-3-phosphate
phospholyase)


restriction endonuclease 


sigma factor or RNA polymerase 
transcription factor 


glutamate-binding protein


recA protein 


dihydrodipicolinate synthase 


dihydrodipicolinate reductase 


L-malate dehydrogenase (acceptor)


Matched 
length
(aa) 


CO 


r-- 

00 


CN 

to 

CO 


CN 

in 




CO 
CO 


o 

r— 


CN 
CO 
CO 


CO 
CO 


m 

Ol 
CN 


to 
r- 

co 


o 

CO 


CO 
CN 


o 
o 
in 


Similarity 


100.0 


100.0 


100.0 


o 

ci 
o 


100.0 


100.0 


100.0 


100.0 


O 
O 
O 


100.0 


o 

d 
o 


o 
d 
o 


O 

d 
o 


100.0 


Identity 


100.0 


o 

d 
o 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


o 

■ d 
o 


100.0 


100.0 


100.0 


100.0 


100.0


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 ndh


Corynebacterium glutamicum
AS019 hisE


Corynebacterium glutamicum
ATCC 13032 ocd


Corynebacterium glutamicum
ATCC 13032 amt


Corynebacterium glutamicum
ATCC 13032 secG 


Corynebacterium glutamicum 
ATCC 13032 ppc 


Corynebacterium glutamicum
AS019 aroC 


Corynebacterium glutamicum 
ATCC 13032 cglIIR


Corynebacterium glutamicum 
ATCC 13869 sigB 


Corynebacterium glutamicum
ATCC 13032 gluB 


Corynebacterium glutamicum 
AS019 recA


Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
ATCC 13869 dapA 


Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
ATCC 13869 dapB 


Corynebacterium glutamicum
R127 mqo


db Match 


gp:CGL238250_1 


o 
r*- 
co 

CO 

o 

< 

CL 

cn 


:< 

CO 

r*- 
o 
o 
_i 
CD 
O 

CL 
Ol 


gp:CGL007732_3 


gp:CGL007732_2 


prf:1509267A 


o' 

o 

CO 
CN 

U- 

< 
CL 

cn 
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s 
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CD 
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_J 

<3 
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3 

UJ 
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< 
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*r 
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CN 
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CN 
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Ol 
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m 
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CO 
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CO 
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Ol 


T 

r- 
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Terminal 
(nt) 


m 
ro 

T 

m 
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1674123 


1675268 


Ol 

TT 

o 

r- 
r- 
CO 


1677387 


1719669 


1882385 


2021846 


2061504 


2063989 


2079281 


2081191 


2113864 
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(nt) 


1544554 


1586725, 


1675208 


1676623 


1677279 
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1720898 


1880490 J 
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2081934 
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GO 2 ™ 
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6973 
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5 



10 



15 



20 



25 



35 



40 



45 



50 



55 



Function 


uridilytyltransferase, uridilylyl- 
re moving enzyme 


nitrogen regulatory protein P-ll 


ammonium transporter 


glutamate dehydrogenase (NADP+) 


pyruvate kinase 


glucokinase 


glutamine synthetase 


threonine synthase 


ectoine/proline/glycine betaine 
carrier 


malate synthase 


isocitrate lyase 


glutamate 5-kinase 


cystathionine gamma-synthase 


ribonucleotide reductase 


glutaredoxin 


Matched 
length 
fa.a) 


CN 

cn 

CO 


CN 


CO 
ro 
rr 


-<r 


m 
r- 


CO 
CN 
CO 




CO 


m 

CD 


cn 

CO 

r-» 


CN 
CO 


cn 

CO 
CO 


CO 
CO 
CO 


CO 
T 


r- 


Similarity 
(%) 


O 

o 
o 


100.0 


o 
o 
o 


100.0 


o 
o 
o 


o 
o 
o 


100.0 


100.0 


100.0 


100.0 


O 

d 
o 


100.0 


100.0 


100.0 


100.0 


Identity 


'100.0 


o 
o 


ioo.o 


iOO.O 


100.0 


100.0 


ioo.o 


100.0 


o 
d 


100.0 


100.0 


100.0 


100.0 


100.0 


100.0 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 glnD 


Corynebacterium glutamicum 
ATCC 13032 glnB 


Corynebacterium glutamicum 
ATCC 13032 amtP 


Corynebacterium glutamicum 
ATCC 17965 gdhA 


Corynebacterium glutamicum 
AS019 pyk 


Corynebacterium glutamicum 
ATCC 13032 glk 


Corynebacterium glutamicum 
ATCC 13032 gtnA 


Corynebacterium glutamicum 
thrC 


Corynebacterium glutamicum 
ATCC 13032 ectP 


Corynebacterium glutamicum 
ATCC 13032 aceB 


'Corynebacterium glutamicum 
ATCC 13032 aceA 


Corynebacterium glutamicum 
ATCC 17965 proB 


Corynebacterium glutamicum 
AS019metB 


Corynebacterium glutamicum 
ATCC 13032 nrdl 


Corynebacterium glutamicum 
ATCC 13032 nrdH 


db Match 


CO 

co 
o 
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< 
O 

CL 

cn 


ro 

CO 
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o 
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< 
O 

CL 

cn 
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gp:AF096280_1 
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CO 
CN 
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(nt) 


2169666 


2171751 


2172154 


2194742 


2205668 


2316582 


2350259 


2353600 


2448328 


2467925 


2472035 


2496670 


2590312 


2679684 


2680419 


Initial 
(nt) 


2171741 


2172086 


2173467 


2196082 


2207092 


2317550 


2348829 


2355042 


12450172 


2470141 


2470740 


2497776 


2591469 


2680127 


2680649 
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NO 
(a.a.) 
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5 
10 


Function 


meso-diamtnopimefate D- 
dehydrogenase 


porin or cell wall channel forming 
protein 


acetate kinase 


phosphate acetyltransferase 


multidrug resistance protein or 
macrolide-efflux pump or 
drug:proton antiporter 


ATP-dependent protease regulatory 
subunit 


prephenate dehydratase 


ectoine/proline uptake protein 


15 


Matched 
length 
(aa) I 


O 
CN 
CO 


in 


r» 

CD 
CO 


CD 
CM 
CO 


cn 
in 


CM 

in 

CO 


in 

CO 


O 

in 
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o 


o 
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o 
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(%) ! 


o 
d 
o 


o 
o 
o 


100.0 


o 

d 
_o 


100.0 


100.0 


100.0 


100.0 


Co l\> 

Table 1 (continued) 


Homologous gene 


Corynebacterium glutamicum 
KY10755 ddh 


Corynebacterium glutamicum 
MH20-22B porA i 


Corynebacterium glutamicum 
ATCC 13032 ackA 


Corynebacterium glutamicum 
ATCC 13032 pta 


Corynebacterium glutamicum 
ATCC 13032 cmr 


Corynebacterium glutamicum 
ATCC 13032 cIpB 


Corynebacterium glutamicum 
pheA 


Corynebacterium glutamicum 
ATCC 13032 proP [ 


35 
40 


db Match 


sp:DDH_CORGL 


gp:CGL238703_1 


sp:ACKA_CORGL 


prf:2516394A 


prf:2309322A 


sp:CLPB_CORGL 
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to 

CO 
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prf.2501295A 
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2786756 


2887944 


2935315 


2936508 


29627 1B 


2963606 


3098578 


3272563 


50 


Initial 
(nt) 


2787715 


2888078 


2936505 


2937494 


2961342 


2966161 


3099522 


3274074 




SEQ 
NO. 
(a.a ) 


6994 
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6669 


7000 


7001 
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3500 


3501 
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ft 



Example 2 

Determination of effective mutation site 

(1) Identification of mutation site based on the comparison of the gene nucleotide sequence of lysine-producing B-6
strain with that of wild type strain ATCC 13032 

[0374] Corynebacterium glutamicum B-6, which is resistant to S-(2-aminoethyl)cysteine (AEC), rifampicin, streptomycin
and 6-azauracil, is a lysine-producing mutant bred from the wild type ATCC 13032 strain by multiple rounds of random
mutagenesis with the mutagen N-methyl-N'-nitro-N-nitrosoguanidine (NTG) and screening (Appl. Microbiol. Biotechnol.,
32: 269-273 (1989)). First, the nucleotide sequences of genes derived from the B-6 strain and considered to relate to
lysine production were determined by a method similar to the above. The genes relating to lysine production include
lysE and lysG, which are lysine-excreting genes; ddh, dapA, hom and lysC (encoding diaminopimelate dehydrogenase,
dihydrodipicolinate synthase, homoserine dehydrogenase and aspartokinase, respectively), which are lysine-biosynthetic
genes; and pyc and zwf (encoding pyruvate carboxylase and glucose-6-phosphate dehydrogenase, respectively), which are
glucose-metabolizing genes. The nucleotide sequences of the genes derived from the production strain were compared
with the corresponding nucleotide sequences of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and
analyzed. As a result, mutation points were observed in many, but not all, of the genes. For example, no mutation site
was observed in lysE, lysG, ddh, dapA, and the like, whereas amino acid replacement mutations were found in hom, lysC,
pyc, zwf, and the like. Among these mutation points, those which are considered to contribute to the production were
extracted on the basis of known biochemical or genetic information. Among the mutation points thus extracted, a
mutation, Val59Ala, in hom and a mutation, Pro458Ser, in pyc were evaluated for effectiveness according to the
following method.
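
The comparison step described above can be illustrated with a minimal sketch. It assumes the two protein sequences are already aligned and of equal length, and it uses toy peptides rather than the real hom or pyc sequences; the function name call_substitutions is hypothetical and is not taken from the specification.

```python
# Three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def call_substitutions(wild_type: str, mutant: str) -> list:
    """List amino acid replacements in the 'Val59Ala' style (1-based positions)."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must be pre-aligned and of equal length")
    calls = []
    for pos, (wt, mt) in enumerate(zip(wild_type, mutant), start=1):
        if wt != mt:
            calls.append(f"{THREE_LETTER[wt]}{pos}{THREE_LETTER[mt]}")
    return calls


# Toy 10-residue peptides (hypothetical, for illustration only).
wt_toy = "MKTVLAGPSE"
mut_toy = "MKTALAGSSE"
print(call_substitutions(wt_toy, mut_toy))
```

A list of this kind only locates the replacements; deciding which of them are effective still requires the biochemical and genetic evaluation described in the text.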

(2) Evaluation of mutation, Val59Ala, in hom and mutation, Pro458Ser, in pyc

[0375] It is known that a mutation in hom inducing requirement or partial requirement for homoserine imparts lysine
productivity to a wild type strain (Amino Acid Fermentation, ed. by Hiroshi Aida et al., Japan Scientific Societies Press).
However, the relationship between the mutation, Val59Ala, in hom and lysine production is not known. Whether or not
the mutation, Val59Ala, in hom is an effective mutation can be examined by introducing the mutation into the wild
type strain and examining the lysine productivity of the resulting strain. On the other hand, whether or not the
mutation, Pro458Ser, in pyc is effective can be examined by introducing this mutation into a lysine-producing strain
which has a deregulated lysine-biosynthetic pathway and is free from the pyc mutation, and comparing the lysine
productivity of the resulting strain with that of the parent strain. As such a lysine-producing bacterium, No. 58
strain (FERM BP-7134) was selected (hereinafter referred to as the "lysine-producing No. 58 strain" or the "No. 58
strain"). Based on the above, it was decided to introduce the mutation, Val59Ala, in hom and the mutation, Pro458Ser,
in pyc into the wild type strain of Corynebacterium glutamicum ATCC 13032 (hereinafter referred to as the "wild type
ATCC 13032 strain" or the "ATCC 13032 strain") and the lysine-producing No. 58 strain, respectively, using the gene
replacement method. A plasmid vector pCES30 for the gene replacement was constructed by the following method.

[0376] A plasmid vector pCE53 having a kanamycin-resistant gene and being capable of autonomously replicating
in coryneform bacteria (Mol. Gen. Genet., 196: 175-178 (1984)) and a plasmid pMOB3 (ATCC 77282) containing a
levansucrase gene (sacB) of Bacillus subtilis (Molecular Microbiology, 6: 1195-1204 (1992)) were each digested with
PstI. Then, after agarose gel electrophoresis, a pCE53 fragment and a 2.6 kb DNA fragment containing sacB were
each extracted and purified using GENECLEAN Kit (manufactured by BIO 101). The pCE53 fragment and the 2.6 kb
DNA fragment were ligated using Ligation Kit ver. 2 (manufactured by Takara Shuzo), introduced into the ATCC 13032
strain by the electroporation method (FEMS Microbiology Letters, 65: 299 (1989)), and cultured on BYG agar medium
(medium prepared by adding 10 g of glucose, 20 g of peptone (manufactured by Kyokuto Pharmaceutical), 5 g of yeast
extract (manufactured by Difco), and 16 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its
pH to 7.2) containing 25 µg/ml kanamycin at 30°C for 2 days to obtain a transformant acquiring kanamycin-resistance.
As a result of digestion analysis with restriction enzymes, it was confirmed that a plasmid extracted from the resulting
transformant by the alkali SDS method had a structure in which the 2.6 kb DNA fragment had been inserted into the
PstI site of pCE53. This plasmid was named pCES30.

[0377] Next, two genes having a mutation point, hom and pyc, were amplified by PCR, and inserted into pCES30
according to the TA cloning method (Bio Experiment Illustrated vol. 3, published by Shujunsha). Specifically, pCES30
was digested with BamHI (manufactured by Takara Shuzo), subjected to agarose gel electrophoresis, and extracted
and purified using GENECLEAN Kit (manufactured by BIO 101). Both ends of the resulting pCES30 fragment were
blunted with DNA Blunting Kit (manufactured by Takara Shuzo) according to the attached protocol. The blunt-ended
pCES30 fragment was concentrated by extraction with phenol/chloroform and precipitation with ethanol, and allowed
to react in the presence of Taq polymerase (manufactured by Roche Diagnostics) and dTTP at 70°C for 2 hours so
that a nucleotide, thymine (T), was added to the 3'-end to prepare a T vector of pCES30.

[0378] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the method
of Saito et al. (Biochem. Biophys. Acta, 72: 619 (1963)). Using the chromosomal DNA as a template, PCR was carried
out with Pfu turbo DNA polymerase (manufactured by Stratagene). For the mutated hom gene, the DNAs having the
nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were used as the primer set. For the mutated pyc
gene, the DNAs having the nucleotide sequences represented by SEQ ID NOS:7004 and 7005 were used as the primer
set. The resulting PCR product was subjected to agarose gel electrophoresis, and extracted and purified using
GENECLEAN Kit (manufactured by BIO 101). Then, the PCR product was allowed to react in the presence of Taq polymerase
(manufactured by Roche Diagnostics) and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A), was added
to the 3'-end.

[0379] The above pCES30 T vector fragment and the A-tailed PCR product of the mutated hom gene (1.7 kb) or the
mutated pyc gene (3.6 kb) were concentrated by extraction with phenol/chloroform and precipitation with ethanol,
and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the culture
according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.7 kb or 3.6 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pChom59 and pCpyc458, respectively.

[0380] The introduction of the mutations into the wild type ATCC 13032 strain and the lysine-producing No. 58 strain
according to the gene replacement method was carried out according to the following method. Specifically, pChom59
and pCpyc458 were introduced into the ATCC 13032 strain and the No. 58 strain, respectively, and strains in which the
plasmid was integrated into the chromosomal DNA by homologous recombination were selected using the method of
Ikeda et al. (Microbiology, 144: 1863 (1998)). Then, the strains in which the second homologous recombination had
occurred were selected by a selection method making use of the fact that the Bacillus subtilis levansucrase encoded
by pCES30 produces a suicidal substance (J. of Bacteriol., 174: 5462 (1992)). Among the selected strains, strains in
which the wild type hom and pyc genes possessed by the ATCC 13032 strain and the No. 58 strain were replaced with
the mutated hom and pyc genes, respectively, were isolated. The method is specifically explained below.
[0381] One strain was selected from the transformants containing the plasmid, pChom59 or pCpyc458, and the
selected strain was cultured in BYG medium containing 20 µg/ml kanamycin, and pCG11 (Japanese Published Examined
Patent Application No. 91827/94) was introduced thereinto by the electroporation method. pCG11 is a plasmid
vector having a spectinomycin-resistant gene and a replication origin which is the same as that of pCE53. After
introduction of pCG11, the strain was cultured on BYG agar medium containing 20 µg/ml kanamycin and 100 µg/ml
spectinomycin at 30°C for 2 days to obtain transformants resistant to both kanamycin and spectinomycin. The
chromosome of one strain of these transformants was examined by Southern blotting hybridization according to the
method reported by Ikeda et al. (Microbiology, 144: 1863 (1998)). As a result, it was confirmed that pChom59 or
pCpyc458 had been integrated into the chromosome by homologous recombination of the Campbell type. In such a strain,
the wild type and mutated hom or pyc genes are present close to each other on the chromosome, and the second
homologous recombination is liable to arise therebetween.

[0382] Each of these transformants (having been recombined once) was spread on Suc agar medium (medium
prepared by adding 100 g of sucrose, 7 g of meat extract, 10 g of peptone, 3 g of sodium chloride, 5 g of yeast extract
(manufactured by Difco), and 18 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its pH to 7.2)
and cultured at 30°C for a day. Then the colonies thus growing were selected in each case. Since a strain in which the
sacB gene is present converts sucrose into a suicide substrate, it cannot grow in this medium (J. Bacteriol., 174: 5462
(1992)). On the other hand, a strain in which the sacB gene was deleted due to the second homologous recombination
between the wild type and the mutated hom or pyc genes positioned close to each other forms no suicide substrate
and, therefore, can grow in this medium. In the homologous recombination, either the wild type gene or the mutated
gene is deleted together with the sacB gene. When the wild type is deleted together with the sacB gene, the gene
replacement into the mutated type arises.
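
The two-step selection logic described above can be sketched as a small state model. This is only an illustrative abstraction: the class and function names are hypothetical, and real recombination frequencies are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    alleles: frozenset  # copies of the target gene on the chromosome
    has_sacB: bool      # True while the integrated plasmid is still present

    def grows_on_sucrose(self) -> bool:
        # A sacB-positive strain forms the suicide substrate and cannot grow.
        return not self.has_sacB


def first_recombination(host_allele: str, plasmid_allele: str) -> Strain:
    """Campbell-type integration: both alleles and sacB sit on the chromosome."""
    return Strain(frozenset({host_allele, plasmid_allele}), has_sacB=True)


def second_recombination(integrant: Strain, lost_allele: str) -> Strain:
    """Excision removes sacB together with exactly one of the two alleles."""
    if lost_allele not in integrant.alleles:
        raise ValueError("allele is not present in the integrant")
    return Strain(integrant.alleles - {lost_allele}, has_sacB=False)


integrant = first_recombination("wild", "mutant")
outcomes = [second_recombination(integrant, lost) for lost in ("wild", "mutant")]
for strain in outcomes:
    print(sorted(strain.alleles), strain.grows_on_sucrose())
```

Both outcomes grow on sucrose, which is why the surviving colonies still have to be checked by sequencing to find those that kept the mutated allele.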

[0383] Chromosomal DNA of each of the second recombinants thus obtained was prepared by the above method of
Saito et al. PCR was carried out using Pfu turbo DNA polymerase (manufactured by Stratagene) and the attached
buffer. For the hom gene, DNAs having the nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were
used as the primer set. For the pyc gene, DNAs having the nucleotide sequences represented by SEQ
ID NOS:7004 and 7005 were used as the primer set. The nucleotide sequences of the PCR products were determined
by the conventional method so that it was judged whether the hom or pyc gene of the second recombinant was a wild
type or a mutant. As a result, the second recombinants named HD-1 and No. 58pyc were the target strains having
the mutated hom gene and pyc gene, respectively.
(3) Lysine production test of HD-1 and No. 58pyc strains

[0384] The HD-1 strain (strain obtained by incorporating the mutation, Val59Ala, in the hom gene into the ATCC
13032 strain) and the No. 58pyc strain (strain obtained by incorporating the mutation, Pro458Ser, in the pyc gene into
the lysine-producing No. 58 strain) were subjected to a culture test in a 5 l jar fermenter, using the ATCC 13032
strain and the lysine-producing No. 58 strain, respectively, as controls, and lysine production was examined.
[0385] After culturing on BYG agar medium at 30°C for 24 hours, each strain was inoculated into 250 ml of a seed
medium (medium prepared by adding 50 g of sucrose, 40 g of corn steep liquor, 8.3 g of ammonium sulfate, 1 g of
urea, 2 g of potassium dihydrogenphosphate, 0.83 g of magnesium sulfate heptahydrate, 10 mg of iron sulfate
heptahydrate, 1 mg of copper sulfate pentahydrate, 10 mg of zinc sulfate heptahydrate, 10 mg of β-alanine, 5 mg of
nicotinic acid, 1.5 mg of thiamin hydrochloride, and 0.5 mg of biotin to 1 liter of water, adjusting its pH to 7.2,
and then adding 30 g of calcium carbonate) contained in a 2 l baffled Erlenmeyer flask and cultured therein
at 30°C for 12 to 16 hours. The total amount of the seed culture was inoculated into 1,400 ml of a main culture
medium (medium prepared by adding 60 g of glucose, 20 g of corn steep liquor, 25 g of ammonium chloride, 2.5 g of
potassium dihydrogenphosphate, 0.75 g of magnesium sulfate heptahydrate, 50 mg of iron sulfate heptahydrate, 13
mg of manganese sulfate pentahydrate, 50 mg of calcium chloride, 6.3 mg of copper sulfate pentahydrate, 1.3 mg of
zinc sulfate heptahydrate, 5 mg of nickel chloride hexahydrate, 1.3 mg of cobalt chloride hexahydrate, 1.3 mg of
ammonium molybdate tetrahydrate, 14 mg of nicotinic acid, 23 mg of β-alanine, 7 mg of thiamin hydrochloride, and
0.42 mg of biotin to 1 liter of water) contained in a 5 l jar fermenter and cultured therein at 32°C, 1 vvm and 800 rpm
while controlling the pH to 7.0 with aqueous ammonia. When glucose in the medium had been consumed, a glucose
feeding solution (medium prepared by adding 400 g of glucose and 45 g of ammonium chloride to 1 liter of water) was
continuously added. The feeding solution was added at a controlled speed so as to maintain the dissolved
oxygen concentration within a range of 0.5 to 3 ppm. After culturing for 29 hours, the culture was terminated.
The cells were separated from the culture medium by centrifugation and then L-lysine hydrochloride in the supernatant
was quantified by high performance liquid chromatography (HPLC). The results are shown in Table 2 below.



Table 2

Strain        L-Lysine hydrochloride yield (g/l)
ATCC 13032    0
HD-1          8
No. 58        45
No. 58pyc     51



[0386] As is apparent from the results shown in Table 2, the lysine productivity was improved by introducing the
mutation, Val59Ala, in the hom gene or the mutation, Pro458Ser, in the pyc gene. Accordingly, it was found that the
mutations are both effective mutations relating to the production of lysine. Strain AHP-3, in which the mutation,
Val59Ala, in the hom gene and the mutation, Pro458Ser, in the pyc gene have been introduced into the wild type ATCC
13032 strain together with the mutation, Thr311Ile, in the lysC gene, was deposited on December 5, 2000, in the
National Institute of Bioscience and Human Technology, Agency of Industrial Science and Technology (Higashi 1-1-3,
Tsukuba-shi, Ibaraki, Japan) as FERM BP-7382.

Example 3 

Reconstruction of lysine-producing strain based on genome information 

[0387] The lysine-producing mutant B-6 strain (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)), which was
constructed from the wild type ATCC 13032 strain by multiple rounds of random mutagenesis with NTG and screening,
produces a remarkably large amount of lysine hydrochloride when cultured in a jar at 32°C using glucose as a carbon
source. However, since the fermentation period is long, the production rate is less than 2.1 g/l/h. Breeding was
therefore performed to reconstitute, in the wild type ATCC 13032 strain, only the effective mutations relating to
the production of lysine from among the estimated 300 or more mutations introduced into the B-6 strain.

(1) Identification of mutation point and effective mutation by comparing the gene nucleotide sequence of the B-6 strain 
with that of the ATCC 13032 strain 

[0388] As described above, the nucleotide sequences of genes derived from the B-6 strain were compared with the
corresponding nucleotide sequences of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and
analyzed to identify many mutation points accumulated in the chromosome of the B-6 strain. Among these, a mutation,
Val59Ala, in hom, a mutation, Thr311Ile, in lysC, a mutation, Pro458Ser, in pyc and a mutation, Ala213Thr, in zwf
were specified as effective mutations relating to the production of lysine. Breeding to reconstitute the 4 mutations in
the wild type strain and thereby construct an industrially important lysine-producing strain was carried out according
to the method shown below.

(2) Construction of plasmid for gene replacement having mutated gene 

[0389] The plasmid for gene replacement, pChom59, having the mutated hom gene and the plasmid for gene
replacement, pCpyc458, having the mutated pyc gene were prepared in Example 2(2) above. Plasmids for gene
replacement having the mutated lysC and zwf were produced as described below.

[0390] The lysC and zwf genes having mutation points were amplified by PCR, and inserted into the plasmid for gene
replacement, pCES30, according to the TA cloning method described in Example 2(2) (Bio Experiment Illustrated, Vol. 3).
[0391] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the above
method of Saito et al. Using the chromosomal DNA as a template, PCR was carried out with Pfu turbo DNA polymerase
(manufactured by Stratagene). For the mutated lysC gene, the DNAs having the nucleotide sequences represented by
SEQ ID NOS:7006 and 7007 were used as the primer set. For the mutated zwf gene, the DNAs having the nucleotide
sequences represented by SEQ ID NOS:7008 and 7009 were used as the primer set. The resulting PCR product was subjected
to agarose gel electrophoresis, and extracted and purified using GENECLEAN Kit (manufactured by BIO 101). Then,
the PCR product was allowed to react in the presence of Taq DNA polymerase (manufactured by Roche Diagnostics)
and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A), was added to the 3'-end.

[0392] The above pCES30 T vector fragment and the A-tailed PCR product of the mutated lysC gene (1.5 kb) or the
mutated zwf gene (2.3 kb) were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the culture
according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.5 kb or 2.3 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pClysC311 and pCzwf213, respectively.

(3) Introduction of mutation, Thr311Ile, in lysC into one point mutant HD-1

[0393] Since the one point mutant HD-1, in which the mutation, Val59Ala, in hom was introduced into the
wild type ATCC 13032 strain, had been obtained in Example 2(2), the mutation, Thr311Ile, in lysC was introduced into
the HD-1 strain using pClysC311 produced in the above (2) according to the gene replacement method described in
Example 2(2). PCR was carried out using chromosomal DNA of the resulting strain and, as the primer set, DNAs having
the nucleotide sequences represented by SEQ ID NOS:7006 and 7007 in the same manner as in Example 2(2). The
nucleotide sequence of the PCR product was determined in the usual manner, which confirmed
that the strain, which was named AHD-2, was a two point mutant having the mutated lysC gene in addition to the mutated
hom gene.

(4) Introduction of mutation, Pro458Ser, in pyc into two point mutant AHD-2 

[0394] The mutation, Pro458Ser, in pyc was introduced into the AHD-2 strain using the pCpyc458 produced in
Example 2(2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal
DNA of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID
NOS:7004 and 7005 in the same manner as in Example 2(2). The nucleotide sequence of
the PCR product was determined in the usual manner, which confirmed that the strain, which was named AHP-3, was
a three point mutant having the mutated pyc gene in addition to the mutated hom gene and lysC gene.

(5) Introduction of mutation, Ala213Thr, in zwf into three point mutant AHP-3 

[0395] The mutation, Ala213Thr, in zwf was introduced into the AHP-3 strain using the pCzwf213 produced in the
above (2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal DNA
of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID NOS:
7008 and 7009 in the same manner as in Example 2(2). The nucleotide sequence of the PCR
product was determined in the usual manner, which confirmed that the strain, which was named APZ-4, was a four point
mutant having the mutated zwf gene in addition to the mutated hom gene, lysC gene and pyc gene.
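
The stepwise reconstruction in (3) to (5) can be summarized as a small data structure. The sketch below only restates the strain names, mutations and plasmids given in this Example together with HD-1 from Example 2; the function name build_lineage is hypothetical.

```python
# (resulting strain, gene, amino acid replacement, gene replacement plasmid),
# listed in the order in which the mutations were introduced.
STEPS = [
    ("HD-1", "hom", "Val59Ala", "pChom59"),
    ("AHD-2", "lysC", "Thr311Ile", "pClysC311"),
    ("AHP-3", "pyc", "Pro458Ser", "pCpyc458"),
    ("APZ-4", "zwf", "Ala213Thr", "pCzwf213"),
]


def build_lineage(steps):
    """Return {strain: {gene: mutation}} with mutations accumulating per step."""
    genotype = {}
    lineage = {}
    for strain, gene, mutation, _plasmid in steps:
        # Copy the parent genotype so earlier strains are not modified.
        genotype = {**genotype, gene: mutation}
        lineage[strain] = genotype
    return lineage


lineage = build_lineage(STEPS)
for strain, genotype in lineage.items():
    print(strain, genotype)
```

Each strain carries all mutations of its parent plus one more, so APZ-4 ends with the four effective mutations and none of the other mutations accumulated in B-6.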

(6) Lysine production test on HD-1, AHD-2, AHP-3 and APZ-4 strains 

[0396] The HD-1, AHD-2, AHP-3 and APZ-4 strains obtained above were subjected to a culture test in a 5 l jar
fermenter in accordance with the method of Example 2(3). 
[0397] Table 3 shows the results. 



Table 3

Strain    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
HD-1      8                               0.3
AHD-2     73                              2.5
AHP-3     80                              2.8
APZ-4     86                              3.0



[0398] Since the lysine-producing mutant B-6 strain which has been bred based on the random mutation and selection 
shows a productivity of less than 2.1 g/l/h, the APZ-4 strain showing a high productivity of 3.0 g/l/h is useful in industry. 
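
The productivity values in Table 3 are consistent with dividing the final titer by the 29-hour culture time given in Example 2(3). The specification does not state this formula explicitly, so the following check is a sketch under that assumption.

```python
# Assumption: productivity (g/l/h) = final titer (g/l) / culture time (h),
# with the 29-hour culture time taken from Example 2(3).
CULTURE_HOURS = 29

titers = {"HD-1": 8, "AHD-2": 73, "AHP-3": 80, "APZ-4": 86}          # g/l, Table 3
reported = {"HD-1": 0.3, "AHD-2": 2.5, "AHP-3": 2.8, "APZ-4": 3.0}   # g/l/h, Table 3

computed = {
    strain: round(titer / CULTURE_HOURS, 1) for strain, titer in titers.items()
}

for strain in titers:
    print(strain, computed[strain], reported[strain])
```

Under this assumption each computed value rounds to the figure reported in Table 3.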

(7) Lysine fermentation by APZ-4 strain at high temperature 

[0399] The APZ-4 strain, which had been reconstructed by introducing 4 effective mutations into the wild type strain, 
was subjected to the culturing test in a 5 l jar fermenter in the same manner as in Example 2(3), except that the culturing
temperature was changed to 40°C. 
[0400] The results are shown in Table 4. 



Table 4

Temperature (°C)    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
32                  86                              3.0
40                  95                              3.3



[0401] As is apparent from the results shown in Table 4, culturing at a high temperature of 40°C gave a lysine
hydrochloride titer and productivity comparable to those obtained at 32°C. In the mutated and bred lysine-producing
B-6 strain constructed by repeating random mutation and selection, the growth and the lysine productivity are lowered
at temperatures exceeding 34°C so that lysine fermentation cannot be carried out, whereas lysine fermentation can
be carried out using the APZ-4 strain at a high temperature of 40°C, so that the load of cooling is greatly reduced,
which is industrially useful. Lysine fermentation at high temperatures is achieved because the APZ-4 strain retains
the high temperature adaptability inherently possessed by the wild type strain.

[0402] As demonstrated in the reconstruction of the lysine-producing strain, the present invention provides a novel 
breeding method effective for eliminating the problems in the conventional mutants and acquiring industrially advan- 
tageous strains. This methodology which reconstitutes the production strain by reconstituting the effective mutation is 
an approach which is efficiently carried out using the nucleotide sequence information of the genome disclosed in the 
present invention, and its effectiveness was found for the first time in the present invention. 

Example 4 

Production of DNA microarray and use thereof 

[0403] A DNA microarray was produced based on the nucleotide sequence information of the ORFs deduced, using
software, from the full nucleotide sequence of Corynebacterium glutamicum ATCC 13032, and genes whose
expression fluctuates depending on the carbon source during culturing were searched for.

(1) Production of DNA microarray 

[0404] Chromosomal DNA was prepared from Corynebacterium glutamicum ATCC 13032 by the method of Saito et
al. (Biochem. Biophys. Acta, 72: 619 (1963)). Based on 24 genes having the nucleotide sequences represented by
SEQ ID NOS:207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453, 3455, 1743, 3470, 2132, 3476,
3477, 3485, 3488, 3489, 3494, 3496, and 3497 from the ORFs shown in Table 1 deduced from the full genome
nucleotide sequence of Corynebacterium glutamicum ATCC 13032 using software and the nucleotide sequence of rabbit
globin gene (GenBank Accession No. V00882) used as an internal standard, oligo DNA primers for PCR amplification
represented by SEQ ID NOS:7010 to 7059 targeting the nucleotide sequences of the genes were synthesized in a
usual manner.

[0405] As the oligo DNA primers used for the PCR, 

[0406] DNAs having the nucleotide sequence represented by SEQ ID NOS:7010 and 7011 were used for the ampli- 
fication of the DNA having the nucleotide sequence represented by SEQ ID NO:207, 

[0407] DNAs having the nucleotide sequence represented by SEQ ID NOS:7012 and 7013 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3433, 

[0408] DNAs having the nucleotide sequence represented by SEQ ID NOS:7014 and 7015 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:281,

[0409] DNAs having the nucleotide sequence represented by SEQ ID NOS:7016 and 7017 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3435, 

[0410] DNAs having the nucleotide sequence represented by SEQ ID NOS:7018 and 7019 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3439, 

[0411] DNAs having the nucleotide sequence represented by SEQ ID NOS:7020 and 7021 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:765, 

[0412] DNAs having the nucleotide sequence represented by SEQ ID NOS:7022 and 7023 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3445, 

[0413] DNAs having the nucleotide sequence represented by SEQ ID NOS:7024 and 7025 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1226,

[0414] DNAs having the nucleotide sequence represented by SEQ ID NOS:7026 and 7027 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1229,

[0415] DNAs having the nucleotide sequence represented by SEQ ID NOS:7028 and 7029 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3448,

[0416] DNAs having the nucleotide sequence represented by SEQ ID NOS:7030 and 7031 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3451, 

[0417] DNAs having the nucleotide sequence represented by SEQ ID NOS:7032 and 7033 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3453, 

[0418] DNAs having the nucleotide sequence represented by SEQ ID NOS:7034 and 7035 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3455, 

[0419] DNAs having the nucleotide sequence represented by SEQ ID NOS:7036 and 7037 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:1743, 

[0420] DNAs having the nucleotide sequence represented by SEQ ID NOS:7038 and 7039 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3470, 

[0421] DNAs having the nucleotide sequence represented by SEQ ID NOS:7040 and 7041 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:2132, 

[0422] DNAs having the nucleotide sequence represented by SEQ ID NOS:7042 and 7043 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3476, 

[0423] DNAs having the nucleotide sequence represented by SEQ ID NOS:7044 and 7045 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3477, 

[0424] DNAs having the nucleotide sequence represented by SEQ ID NOS:7046 and 7047 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3485, 

[0425] DNAs having the nucleotide sequence represented by SEQ ID NOS:7048 and 7049 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3488, 

[0426] DNAs having the nucleotide sequence represented by SEQ ID NOS:7050 and 7051 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3489, 

[0427] DNAs having the nucleotide sequence represented by SEQ ID NOS:7052 and 7053 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3494, 

[0428] DNAs having the nucleotide sequence represented by SEQ ID NOS:7054 and 7055 were used for the am- 
plification of the DNA having the nucleotide sequence represented by SEQ ID NO:3496, 

[0429] DNAs having the nucleotide sequence represented by SEQ ID NOS:7056 and 7057 were used for the
amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3497, and

[0430] DNAs having the nucleotide sequence represented by SEQ ID NOS:7058 and 7059 were used for the am- 
plification of the DNA having the nucleotide sequence of the rabbit globin gene, 
as the respective primer set.
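
The pairing enumerated in paragraphs [0406] to [0430] follows a regular pattern: the n-th target in the list of paragraph [0404] is amplified with two consecutive primer SEQ ID NOs starting at 7010. The following Python sketch reproduces that bookkeeping. The names `TARGETS`, `FIRST_PRIMER` and `primer_pairs` are illustrative assumptions and do not appear in the specification.

```python
# Target SEQ ID NOs in the order given in paragraph [0404]; the rabbit globin
# internal standard comes last. All names here are illustrative only.
TARGETS = [
    207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453,
    3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488, 3489, 3494, 3496, 3497,
    "rabbit globin",
]

FIRST_PRIMER = 7010  # SEQ ID NO of the first primer in the series


def primer_pairs(targets, first=FIRST_PRIMER):
    """Map each target to its (primer 1, primer 2) pair of SEQ ID NOs."""
    return {
        target: (first + 2 * i, first + 2 * i + 1)
        for i, target in enumerate(targets)
    }


PAIRS = primer_pairs(TARGETS)

if __name__ == "__main__":
    for target, (p1, p2) in PAIRS.items():
        print(f"{target}: SEQ ID NOS:{p1} and {p2}")
```

A lookup such as `PAIRS[1226]` returns the two primer SEQ ID NOs for that target, which is a convenient cross-check against the list above.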

[0431] The PCR was carried out for 30 cycles, each cycle consisting of 15 seconds at 95°C and 3 minutes at 68°C,
using a thermal cycler (GeneAmp PCR System 9600, manufactured by Perkin Elmer), TaKaRa Ex-Taq (manufactured
by Takara Shuzo), 100 ng of the chromosomal DNA and the buffer attached to the TaKaRa Ex-Taq reagent. In the case
of the rabbit globin gene, a single-stranded cDNA which had been synthesized from rabbit globin mRNA (manufactured
by Life Technologies) according to the manufacturer's instructions using a reverse transcriptase RAV-2 (manufactured
by Takara Shuzo) was used as the template. The PCR product of each gene thus amplified was subjected to agarose
gel electrophoresis and extracted and purified using QIAquick Gel Extraction Kit (manufactured by QIAGEN). The
purified PCR product was concentrated by precipitating it with ethanol and adjusted to a concentration of 200 ng/µl.
Each PCR product was spotted on a slide glass plate (manufactured by Matsunami Glass) having MAS coating in 2 runs
using GTMASS SYSTEM (manufactured by Nippon Laser & Electronics Lab.) according to the manufacturer's instructions.

(2) Synthesis of fluorescence labeled cDNA 

[0432] The ATCC 13032 strain was spread on BY agar medium (medium prepared by adding 20 g of peptone
(manufactured by Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured by Difco), and 16 g of Bactoagar
(manufactured by Difco) to 1 liter of water and adjusting its pH to 7.2) and cultured at 30°C for 2 days. Then, the cultured
strain was further inoculated into 5 ml of BY liquid medium and cultured at 30°C overnight. Then, the cultured strain
was further inoculated into 30 ml of a minimum medium (medium prepared by adding 5 g of ammonium sulfate, 5 g of
urea, 0.5 g of monopotassium dihydrogenphosphate, 0.5 g of dipotassium monohydrogenphosphate, 20.9 g of
morpholinopropanesulfonic acid, 0.25 g of magnesium sulfate heptahydrate, 10 mg of calcium chloride dihydrate, 10 mg
of manganese sulfate monohydrate, 10 mg of ferrous sulfate heptahydrate, 1 mg of zinc sulfate heptahydrate, 0.2 mg
of copper sulfate, and 0.2 mg of biotin to 1 liter of water, and adjusting its pH [value illegible in the source]) containing
110 mmol/l glucose or 200 mmol/l ammonium acetate, and cultured in an Erlenmeyer flask at 30°C to give an absorbance
of 1.0 at 660 nm. After the cells were collected by centrifuging at 4°C and 5,000 rpm for 10 minutes, total RNA was
prepared from the resulting cells according to the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)).
To avoid contamination with DNA, the RNA was treated with DNase I (manufactured by Takara Shuzo) at 37°C for
30 minutes and then further purified using Qiagen RNeasy MiniKit (manufactured by QIAGEN) according to the
manufacturer's instructions. To 30 µg of the resulting total RNA, 0.6 µl of rabbit globin mRNA (50 ng/µl, manufactured
by Life Technologies) and 1 µl of a random 6-mer primer (500 ng/µl, manufactured by Takara Shuzo) were added, and
the mixture was denatured at 65°C for 10 minutes, followed by quenching on ice. To the resulting solution, 6 µl of a
buffer attached to Superscript II (manufactured by Life Technologies), 3 µl of 0.1 mol/l DTT, 1.5 µl of dNTPs (25 mmol/l
dATP, 25 mmol/l dCTP, 25 mmol/l dGTP, 10 mmol/l dTTP), 1.5 µl of Cy5-dUTP or Cy3-dUTP (manufactured by NEN)
and 2 µl of Superscript II were added, and the mixture was allowed to stand at 25°C for 10 minutes and then at 42°C for
110 minutes. The RNA extracted from the cells using glucose as the carbon source and the RNA extracted from the
cells using ammonium acetate were labeled with Cy5-dUTP and Cy3-dUTP, respectively. After the fluorescence labeling
reaction, the RNA was digested by adding 1.5 µl of 1 mol/l sodium hydroxide-20 mmol/l EDTA solution and 3.0 µl of
10% SDS solution, and allowing the mixture to stand at 65°C for 10 minutes. The two cDNA solutions after the labeling
were mixed and purified using Qiagen PCR purification Kit (manufactured by QIAGEN) according to the manufacturer's
instructions to give a volume of 10 µl.

(3) Hybridization 

[0433] UltraHyb (110 µl) (manufactured by Ambion) and the fluorescence-labeled cDNA solution (10 µl) were mixed
and subjected to hybridization and the subsequent washing of slide glass using GeneTAC Hybridization Station
(manufactured by Genomic Solutions) according to the manufacturer's instructions. The hybridization was carried out at
50°C, and the washing was carried out at 25°C.

(4) Fluorescence analysis 

[0434] The fluorescence amount of each DNA array having the fluorescent cDNA hybridized therewith was measured 
using ScanArray 4000 (manufactured by GSI Lumonics). 

[0435] Table 5 shows the Cy3 and Cy5 signal intensities of the genes having been corrected on the basis of the data 
of the rabbit globin used as the internal standard and the Cy3/Cy5 ratios. 



Table 5

SEQ ID NO    Cy3 intensity    Cy5 intensity    Cy3/Cy5
207          5248             3240             1.62
3433         2239             2694             0.83
281          2370             2595             0.91
3435         2566             2515             1.02
3439         5597             6944             0.81
765          6134             4943             1.24
3445         1169             1284             0.91
1226         1301             1493             0.87
1229         1168             1131             1.03
3448         1187             1594             0.74
3451         2845             3859             0.74
3453         3498             1705             2.05
3455         1491             1144             1.30
1743         1972             1841             1.07
3470         4752             3764             1.26
2132         (illegible)      1085             1.08
3476         1847             1420             1.30
3477         1284             1164             1.10
3485         4539             8014             0.57
3488         34289            1398             24.52
3489         43645            1497             29.16
3494         3199             2503             1.28
3496         3428             2364             1.45
3497         3848             3358             1.15



[0436] The ORF function data estimated by using software were searched for SEQ ID NOS:3488 and 3489, which showed
remarkably strong Cy3 signals. As a result, it was found that SEQ ID NOS:3488 and 3489 are a malate synthase
gene and an isocitrate lyase gene, respectively. It is known that these genes are transcriptionally induced by acetic
acid in Corynebacterium glutamicum (Archives of Microbiology, 168: 262-269 (1997)).
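
The ratio column of Table 5 and the selection of strongly acetate-induced genes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are a subset copied from Table 5, Cy3 corresponds to the ammonium acetate culture and Cy5 to the glucose culture as described in paragraph [0432], and the 10-fold cutoff `MIN_FOLD` is an arbitrary value chosen for this example, not a criterion stated in the specification.

```python
# (SEQ ID NO, Cy3 intensity, Cy5 intensity): a subset of rows from Table 5.
# Cy3 = ammonium acetate culture, Cy5 = glucose culture.
ROWS = [
    (207, 5248, 3240),
    (3433, 2239, 2694),
    (3453, 3498, 1705),
    (3485, 4539, 8014),
    (3488, 34289, 1398),
    (3489, 43645, 1497),
    (3497, 3848, 3358),
]

MIN_FOLD = 10.0  # hypothetical cutoff, chosen only for illustration


def cy3_cy5_ratio(cy3, cy5):
    """Return the Cy3/Cy5 signal ratio for one spot."""
    return cy3 / cy5


def induced_on_acetate(rows, min_fold=MIN_FOLD):
    """Return the SEQ ID NOs whose Cy3/Cy5 ratio is at least min_fold."""
    return [seq_id for seq_id, cy3, cy5 in rows
            if cy3_cy5_ratio(cy3, cy5) >= min_fold]


INDUCED = induced_on_acetate(ROWS)

if __name__ == "__main__":
    for seq_id, cy3, cy5 in ROWS:
        print(f"SEQ ID NO:{seq_id}  Cy3/Cy5 = {cy3_cy5_ratio(cy3, cy5):.2f}")
    print("Strongly induced on acetate:", INDUCED)
```

With a lower cutoff, genes showing a modest change would also be reported, so the threshold should be chosen with the measurement noise of the array in mind.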

[0437] As described above, genes whose expression fluctuates could be discovered by synthesizing appropriate
oligo DNA primers based on the ORF nucleotide sequence information deduced from the full genomic nucleotide
sequence information of Corynebacterium glutamicum ATCC 13032 using software, amplifying the nucleotide
sequences of the genes by PCR using the genome DNA of Corynebacterium glutamicum as a template, and thus
producing and using a DNA microarray.

[0438] This Example shows that the expression amounts of the 24 genes can be analyzed using a DNA microarray.
On the other hand, the present DNA microarray techniques make it possible to prepare DNA microarrays having thereon 
several thousand gene probes at once. Accordingly, it is also possible to prepare DNA microarrays having thereon all 
of the ORF gene probes deduced from the full genomic nucleotide sequence of Corynebacterium glutamicum ATCC 
13032 determined by the present invention, and analyze the expression profile at the total gene level of Corynebac- 
terium glutamicum using these arrays. 

Example 5 

Homology search using Corynebacterium glutamicum genome sequence 
(1) Search of adenosine deaminase 

[0439] The amino acid sequence (ADD_ECOLI) of Escherichia coli adenosine deaminase was obtained from the
Swiss-Prot Database as the amino acid sequence of a protein whose function had been confirmed as adenosine deaminase
(EC 3.5.4.4). By using the full length of this amino acid sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the amino
acid sequences in the ORF region deduced from the genome sequence using FASTA program (Proc. Natl. Acad. Sci. USA, 85:
2444-2448 (1988)). A case where the E-value was 1e-10 or less was judged as being significantly homologous. As a result,
no sequence significantly homologous with the Escherichia coli adenosine deaminase was found in the nucleotide
sequence database of the genome sequence of Corynebacterium glutamicum or the database of the amino acid
sequences in the ORF region deduced from the genome sequence. Based on these results, it is assumed that
Corynebacterium glutamicum contains no ORF having adenosine deaminase activity and thus has no activity of converting
adenosine into inosine.

(2) Search of glycine cleavage enzyme 

[0440] The sequences (GCSP_ECOLI, GCST_ECOLI and GCSH_ECOLI) of glycine decarboxylase, aminomethyl
transferase and an aminomethyl group carrier, each of which is a component of Escherichia coli glycine cleavage
enzyme, were obtained from the Swiss-Prot Database as the amino acid sequences of proteins whose function had been
confirmed as glycine cleavage enzyme (EC 2.1.2.10).

[0441] By using these full-length amino acid sequences as a query, a homology search was carried out on a nucleotide
sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF amino acid
sequences deduced from the genome sequence using FASTA program. A case where the E-value was 1e-10 or less was
judged as being significantly homologous. As a result, no sequence significantly homologous with the glycine
decarboxylase, the aminomethyl transferase or the aminomethyl group carrier, each of which is a component of Escherichia
coli glycine cleavage enzyme, was found in the nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or the database of the ORF amino acid sequences estimated from the genome sequence. Based
on these results, it is assumed that Corynebacterium glutamicum contains no ORF having the activity of glycine
decarboxylase, aminomethyl transferase or the aminomethyl group carrier and thus has no activity of the glycine cleavage
enzyme.

(3) Search of IMP dehydrogenase 

[0442] The amino acid sequence (IMDH_ECOLI) of Escherichia coli IMP dehydrogenase was obtained from the
Swiss-Prot Database as the amino acid sequence of a protein whose function had been confirmed as IMP dehydrogenase
(EC 1.1.1.205). By using the full length of this amino acid sequence as a query, a homology search was carried out on
a nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF
amino acid sequences predicted from the genome sequence using FASTA program. A case where the E-value was 1e-10
or less was judged as being significantly homologous. As a result, the amino acid sequences encoded by two ORFs,
namely, an ORF positioned in the region of the nucleotide sequence No. 615336 to 616853 (or ORF having the
nucleotide sequence represented by SEQ ID NO:672) and another ORF positioned in the region of the nucleotide sequence
No. 616973 to 618094 (or ORF having the nucleotide sequence represented by SEQ ID NO:674) were significantly
homologous with Escherichia coli IMP dehydrogenase. By using the above-described predicted amino
acid sequences as a query in order to examine the similarity of the amino acid sequences encoded by the ORFs with
IMP dehydrogenases of other organisms in greater detail, a search was carried out on GenBank (http://www.ncbi.nlm.nih.gov/)
nr-aa database (amino acid sequence database constructed on the basis of GenBank CDS translation products,
PDB database, Swiss-Prot database, PIR database, PRF database by eliminating duplicated registrations) using
BLAST program. As a result, both of the two amino acid sequences showed significant homologies with IMP
dehydrogenases of other organisms and clearly higher homologies with IMP dehydrogenases than with amino acid sequences
of other proteins, and thus, it was assumed that the two ORFs would function as IMP dehydrogenase. Based on these
results, it was therefore assumed that Corynebacterium glutamicum has two ORFs having the IMP dehydrogenase
activity.
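
The decision rule used in all three searches of this Example (a hit counts as significantly homologous when its E-value is 1e-10 or less) can be expressed as a small filter. The sketch below is illustrative only: the `Hit` record, the subject names and the E-values are invented for demonstration and are not output of the FASTA or BLAST programs or data from the specification.

```python
from typing import List, NamedTuple

E_VALUE_CUTOFF = 1e-10  # threshold stated in this Example


class Hit(NamedTuple):
    """One database hit; a hypothetical record type for this sketch."""
    subject: str
    e_value: float


def significant_hits(hits: List[Hit], cutoff: float = E_VALUE_CUTOFF) -> List[Hit]:
    """Keep hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit.e_value <= cutoff]


# Invented example values, not real search results.
EXAMPLE_HITS = [
    Hit("orf_a", 3e-42),
    Hit("orf_b", 1e-10),
    Hit("orf_c", 0.5),
]

if __name__ == "__main__":
    kept = significant_hits(EXAMPLE_HITS)
    if kept:
        for hit in kept:
            print(f"{hit.subject}: E-value {hit.e_value:g} (significant)")
    else:
        print("No significantly homologous sequence found")
```

An empty result corresponds to the outcome reported for adenosine deaminase and the glycine cleavage enzyme components, and a non-empty result to the outcome reported for IMP dehydrogenase.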

Example 6 

Proteome analysis of proteins derived from Corynebacterium glutamicum 

(1) Preparation of proteins derived from Corynebacterium glutamicum ATCC 13032, FERM BP-7134 and FERM BP-158

[0443] Culturing tests of Corynebacterium glutamicum ATCC 13032 (wild type strain), Corynebacterium glutamicum
FERM BP-7134 (lysine-producing strain) and Corynebacterium glutamicum FERM BP-158 (lysine-highly producing
strain) were carried out in a 5 l jar fermenter according to the method in Example 2(3). The results are shown in Table 6.
Table 6

Strain          L-Lysine yield (g/l)
ATCC 13032      0
FERM BP-7134    45
FERM BP-158     60



[0444] After culturing, cells of each strain were recovered by centrifugation. These cells were washed with Tris-HCl
buffer (10 mmol/l Tris-HCl, pH 6.5, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer Mannheim))
three times to give washed cells which could be stored under freezing at -80°C. The freeze-stored cells were thawed
before use, and used as washed cells.

[0445] The washed cells described above were suspended in a disruption buffer (10 mmol/l Tris-HCl, pH 7.4, 5 mmol/l
magnesium chloride, 50 mg/l RNase, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer
Mannheim)), and disrupted with a disruptor (manufactured by Brown) under cooling. To the resulting disruption solution,
DNase was added to give a concentration of 50 mg/l, and the solution was allowed to stand on ice for 10 minutes. The
solution was centrifuged (5,000 x g, 15 minutes, 4°C) to remove the undisrupted cells as the precipitate, and the
supernatant was recovered.

[0446] To the supernatant, urea was added to give a concentration of 9 mol/l, and an equivalent amount of a lysis
buffer (9.5 mol/l urea, 2% NP-40, 2% Ampholine, 5% mercaptoethanol, 1.6 mg/ml protease inhibitor (COMPLETE;
manufactured by Boehringer Mannheim)) was added thereto, followed by thoroughly stirring at room temperature for
dissolving.

[0447] After being dissolved, the solution was centrifuged at 12,000 x g for 15 minutes, and the supernatant was
recovered.

[0448] To the supernatant, ammonium sulfate was added to the extent of 80% saturation, followed by thoroughly
stirring for dissolving.

[0449] After being dissolved, the solution was centrifuged (16,000 x g, 20 minutes, 4°C), and the precipitate was 
recovered. This precipitate was dissolved in the lysis buffer again and used in the subsequent procedures as a protein 
sample. The protein concentration of this sample was determined by the method for quantifying protein of Bradford. 



(2) Separation of protein by two dimensional electrophoresis 



[0450] The first dimensional electrophoresis was carried out as described below by the isoelectric electrophoresis 
method. 

[0451] A molded dry IPG strip gel (pH 4-7, 13 cm, Immobiline DryStrips; manufactured by Amersham Pharmacia
Biotech) was set in an electrophoretic apparatus (Multiphor II or IPGphor; manufactured by Amersham Pharmacia
Biotech) and a swelling solution (8 mol/l urea, 0.5% Triton X-100, 0.6% dithiothreitol, 0.5% Ampholine, pH 3-10) was
packed therein, and the gel was allowed to stand for swelling for 12 to 16 hours.

[0452] The protein sample prepared above was dissolved in a sample solution (9 mol/l urea, 2% CHAPS, 1%
dithiothreitol, 2% Ampholine, pH 3-10), and then about 100 to 500 µg (in terms of protein) portions thereof were taken and
added to the swollen IPG strip gel.

[0453] The electrophoresis was carried out in the 4 steps defined below while controlling the temperature at 20°C:

step 1: 1 hour under a gradient mode of 0 to 500 V;
step 2: 1 hour under a gradient mode of 500 to 1,000 V;
step 3: 4 hours under a gradient mode of 1,000 to 8,000 V; and
step 4: 1 hour at a constant voltage of 8,000 V.
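
As a worked check of the focusing program above, the accumulated volt-hours can be estimated as follows. This is a sketch under a stated assumption: each gradient step is treated as a linear voltage ramp, so its contribution is the step duration multiplied by the mean of the start and end voltages. The specification does not state the ramp shape or a volt-hour total, and the names `IEF_STEPS` and `volt_hours` are illustrative.

```python
# (duration in hours, start voltage in V, end voltage in V) for steps 1 to 4.
IEF_STEPS = [
    (1.0, 0, 500),       # step 1: gradient 0 to 500 V
    (1.0, 500, 1000),    # step 2: gradient 500 to 1,000 V
    (4.0, 1000, 8000),   # step 3: gradient 1,000 to 8,000 V
    (1.0, 8000, 8000),   # step 4: constant 8,000 V
]


def volt_hours(steps):
    """Sum duration x mean voltage over all steps (assumes linear ramps)."""
    return sum(hours * (v_start + v_end) / 2 for hours, v_start, v_end in steps)


TOTAL_VH = volt_hours(IEF_STEPS)

if __name__ == "__main__":
    print(f"Estimated total: {TOTAL_VH:.0f} Vh")
```

Under the linear-ramp assumption the four steps contribute 250, 750, 18,000 and 8,000 Vh, respectively.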



[0454] After the isoelectric electrophoresis, the IPG strip gel was removed from the holder and soaked in an equilibration
buffer A (50 mmol/l Tris-HCl, pH 6.8, 30% glycerol, 1% SDS, 0.25% dithiothreitol) for 15 minutes and then in another
equilibration buffer B (50 mmol/l Tris-HCl, pH 6.8, 6 mol/l urea, 30% glycerol, 1% SDS, 0.45% iodoacetamide) for 15 minutes
to sufficiently equilibrate the gel.

[0455] After the equilibration, the IPG strip gel was lightly rinsed in an SDS electrophoresis buffer (1.4% glycine, 0.1%
SDS, 0.3% Tris-HCl, pH 8.5), and the second dimensional electrophoresis depending on molecular weight was carried
out as described below to separate the proteins.

[0456] Specifically, the above IPG strip gel was closely placed on a 14% polyacrylamide slab gel (14% polyacrylamide,
0.37% bisacrylamide, 37.5 mmol/l Tris-HCl, pH 8.8, 0.1% SDS, 0.1% TEMED, 0.1% ammonium persulfate) and
subjected to electrophoresis under a constant current of 30 mA at 20°C for 3 hours to separate the proteins.

(3) Detection of protein spot 

[0457] Coomassie staining was performed by the method of Gorg et al. (Electrophoresis, 9: 531-546 (1988)) for the
slab gel after the second dimensional electrophoresis. Specifically, the slab gel was stained under shaking at 25°C for
about 3 hours, the excessive coloration was removed with a decoloring solution, and the gel was thoroughly washed
with distilled water.

[0458] The results are shown in Fig. 2. The proteins derived from the ATCC 13032 strain (Fig. 2A), FERM BP-7134
strain (Fig. 2B) and FERM BP-158 strain (Fig. 2C) could be separated and detected as spots.

(4) In-gel digestion of detected protein spot 

[0459] The detected spots were each cut out from the gel and transferred into a siliconized tube, and 400 µl of 100
mmol/l ammonium bicarbonate : acetonitrile solution (1:1, v/v) was added thereto, followed by shaking overnight and
freeze-drying as such. To the dried gel, 10 µl of a lysylendopeptidase (LysC) solution (manufactured by WAKO, prepared
with 0.1% SDS-containing 50 mmol/l ammonium bicarbonate to give a concentration of 100 ng/µl) was added and the
gel was allowed to stand for swelling at 0°C for 45 minutes, and then allowed to stand at 37°C for 16 hours. After
removing the LysC solution, 20 µl of an extracting solution (a mixture of 60% acetonitrile and 5% formic acid) was
added, followed by ultrasonication at room temperature for 5 minutes to disrupt the gel. After the disruption, the extract
was recovered by centrifugation (12,000 rpm, 5 minutes, room temperature). This operation was repeated twice to
recover the whole extract. The recovered extract was concentrated by centrifugation in vacuo to halve the liquid volume.
To the concentrate, 20 µl of 0.1% trifluoroacetic acid was added, followed by thoroughly stirring, and the mixture was
subjected to desalting using ZipTip (manufactured by Millipore). The protein adsorbed on the carriers of ZipTip was
eluted with 5 µl of α-cyano-4-hydroxycinnamic acid for use as a sample solution for analysis.

(5) Mass spectrometry and amino acid sequence analysis of protein spot with matrix assisted laser desorption ionization 
time of flight mass spectrometer (MALDI-TOFMS) 

[0460] The sample solution for analysis was mixed in the equivalent amount with a solution of a peptide mixture for
mass calibration (300 nmol/l Angiotensin II, 300 nmol/l Neurotensin, 150 nmol/l ACTH clip 18-39, 2.3 µmol/l bovine
insulin B chain), and 1 µl of the obtained solution was spotted on a stainless probe and crystallized by spontaneous
drying.

[0461] As measurement instruments, REFLEX MALDI-TOF mass spectrometer (manufactured by Bruker) and an
N2 laser (337 nm) were used in combination.

[0462] The analysis by PMF (peptide-mass finger printing) was carried out using integration spectra data obtained 
by measuring 30 times at an accelerated voltage of 19.0 kV and a detector voltage of 1.50 kV under reflector mode 
conditions. Mass calibration was carried out by the internal standard method. 

[0463] The PSD (post-source decay) analysis was carried out using integration spectra obtained by successively
altering the reflection voltage and the detector voltage at an accelerated voltage of 27.5 kV.

[0464] The masses and amino acid sequences of the peptide fragments derived from the protein spot after digestion 
were thus determined. 

(6) Identification of protein spot 

[0465] From the amino acid sequence information of the digested peptide fragments derived from the protein spot
obtained in the above (5), ORFs corresponding to the protein were searched on the genome sequence database of
Corynebacterium glutamicum ATCC 13032 as constructed in Example 1 to identify the protein.

[0466] The identification of the protein was carried out using MS-Fit program and MS-Tag program of intranet protein
prospector.
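
The principle behind peptide-mass fingerprinting can be illustrated with a minimal Python sketch: an ORF translation is cleaved in silico after every lysine (the specificity of lysylendopeptidase), the monoisotopic [M+H]+ mass of each peptide is calculated, and observed masses are compared with the theoretical ones within a tolerance. This is not the algorithm of the MS-Fit or MS-Tag programs. The function names, the 0.5 Da tolerance and the toy sequence are assumptions made for this example, and missed cleavages and modifications are ignored.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of the ionizing proton


def lysc_digest(sequence):
    """Cleave a protein sequence C-terminal to every lysine (K)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "K":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def count_matches(observed, sequence, tol=0.5):
    """Count observed masses within tol (Da) of any theoretical peptide."""
    theoretical = [mh_plus(p) for p in lysc_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


if __name__ == "__main__":
    toy_orf = "MSAKGLVEKWTR"  # invented sequence, for illustration only
    for pep in lysc_digest(toy_orf):
        print(f"{pep}: [M+H]+ = {mh_plus(pep):.4f}")
```

Ranking all ORFs of a genome database by `count_matches` gives a crude candidate list. Dedicated tools additionally weigh mass accuracy, sequence coverage and protein size.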

(a) Search and identification of gene encoding high-expression protein 

[0467] Among the proteins derived from Corynebacterium glutamicum ATCC 13032 showing high expression amounts in
CBB-staining shown in Fig. 2A, the proteins corresponding to Spots-1, 2, 3, 4 and 5 were identified by the above method.

[0468] As a result, it was found that Spot-1 corresponded to enolase which was a protein having the amino acid
sequence of SEQ ID NO:4585; Spot-2 corresponded to phosphoglycerate kinase which was a protein having the amino
acid sequence of SEQ ID NO:5254; Spot-3 corresponded to glyceraldehyde-3-phosphate dehydrogenase which was
a protein having the amino acid sequence represented by SEQ ID NO:5255; Spot-4 corresponded to fructose
bisphosphate aldolase which was a protein having the amino acid sequence represented by SEQ ID NO:6543; and Spot-5
corresponded to triose phosphate isomerase which was a protein having the amino acid sequence represented by
SEQ ID NO:5252.

[0469] These genes, represented by SEQ ID NOS:1085, 1754, 1775, 3043 and 1752, which encode the known proteins
corresponding to Spots-1, 2, 3, 4 and 5, respectively, are important in the central metabolic
pathway for maintaining the life of the microorganism. Particularly, it is suggested that the genes of Spots-2, 3 and 5
form an operon and that a high-expression promoter is encoded upstream thereof (J. of Bacteriol., 174: 6067-6086
(1992)).

[0470] Also, the protein corresponding to Spot-9 in Fig. 2 was identified in the same manner as described above, 
and it was found that Spot-9 was an elongation factor Tu which was a protein having the amino acid sequence repre- 
sented by SEQ ID No:6937, and that the protein was encoded by DNA having the nucleotide sequence represented 
by SEQ ID No:3437. 

[0471] Based on these results, the proteins having high expression level were identified by proteome analysis using
the genome sequence database of Corynebacterium glutamicum constructed in Example 1. Thus, the nucleotide
sequences of the genes encoding the proteins and the nucleotide sequences upstream thereof could be searched
simultaneously. Accordingly, it is shown that nucleotide sequences having a function as a high-expression promoter can be
efficiently selected.

(b) Search and identification of modified protein 

[0472] Among the proteins derived from Corynebacterium glutamicum FERM BP-7134 shown in Fig. 2B, Spots-6,
7 and 8 were identified by the above method. As a result, these three spots all corresponded to catalase which was a
protein having the amino acid sequence represented by SEQ ID NO:3785.

[0473] Accordingly, Spots-6, 7 and 8, detected as spots differing in isoelectric mobility, were all products derived
from a catalase gene having the nucleotide sequence represented by SEQ ID No:285. It is thus shown that
the catalase derived from Corynebacterium glutamicum FERM BP-7134 was modified after translation.

[0474] Based on these results, it is confirmed that various modified proteins can be efficiently searched by proteome
analysis using the genome sequence database of Corynebacterium glutamicum constructed in Example 1.

(c) Search and identification of expressed protein effective in lysine production 

[0475] It was found out that in Fig. 2A (ATCC 13032: wild type strain), Fig. 2B (FERM BP-7134: lysine-producing 
strain) and Fig. 2C (FERM BP-158: lysine-highly producing strain), the catalase corresponding to Spot-8 and the elon- 
gation factor Tu corresponding to Spot-9 as identified above showed the higher expression level with an increase in 
the lysine productivity. 

[0476] Based on these results, it was found that promising mutated proteins can be efficiently searched and identified,
in breeding aimed at strengthening the productivity of a target product, by the proteome analysis using the genome
sequence database of Corynebacterium glutamicum constructed in Example 1.

[0477] Moreover, useful mutation points of useful mutants can be easily specified by searching the nucleotide se- 
quences (nucleotide sequences of promoter, ORF, or the like) relating to the identified proteins using the above data- 
base and using primers designed on the basis of the sequences. As a result of the fact that the mutation points are 
specified, industrially useful mutants which have the useful mutations or other useful mutations derived therefrom can 
be easily bred. 

[0478] While the invention has been described in detail and with reference to specific embodiments thereof, it will 
be apparent to one of skill in the art that various changes and modifications can be made therein without departing 
from the spirit and scope thereof. All references cited herein are incorporated in their entirety. 

Claims 

1. A method for at least one of the following: 

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium, 

(B) measuring an expression amount of a gene derived from a coryneform bacterium, 

(C) analyzing an expression profile of a gene derived from a coryneform bacterium,

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or 

(E) identifying a gene homologous to a gene derived from a coryneform bacterium, 
said method comprising:

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected 
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any 
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of 
the first or second polynucleotides, 

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a co- 
ryneform bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a 
labeled polynucleotide to be examined, under hybridization conditions, 

(c) detecting any hybridization, and 

(d) analyzing the result of the hybridization. 
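
Step (d) leaves the form of the analysis open. As a minimal sketch, assuming that step (c) yields one numeric signal intensity per array spot for a parent strain and for a mutant, the comparison could be expressed as a log2 ratio per spot. The spot names, intensities, `floor` value, and two-fold threshold below are hypothetical and are not taken from the specification.

```python
import math


def log2_ratios(parent_signal, mutant_signal, floor=1.0):
    """Return log2(mutant / parent) for every spot measured on both arrays.

    Signals below `floor` are raised to `floor` so that the ratio is defined.
    """
    ratios = {}
    for spot, parent_value in parent_signal.items():
        if spot not in mutant_signal:
            continue  # spot not measured for the mutant
        p = max(parent_value, floor)
        m = max(mutant_signal[spot], floor)
        ratios[spot] = math.log2(m / p)
    return ratios


def changed_spots(ratios, threshold=1.0):
    """Spots whose signal changed at least 2**threshold fold in either direction."""
    return sorted(spot for spot, r in ratios.items() if abs(r) >= threshold)


# Hypothetical intensities, keyed by an illustrative spot label.
parent = {"spot_A": 100.0, "spot_B": 250.0, "spot_C": 80.0}
mutant = {"spot_A": 410.0, "spot_B": 240.0, "spot_C": 20.0}

ratios = log2_ratios(parent, mutant)
print(changed_spots(ratios))
```

A real analysis would also need background subtraction and normalization between arrays, which this sketch omits.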

2. The method according to claim 1, wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

3. The method according to claim 2, wherein the microorganism belonging to the genus Corynebacterium is selected 
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium 
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

4. The method according to claim 1, wherein the polynucleotide derived from a coryneform bacterium, the polynucleotide
derived from a mutant of the coryneform bacterium, or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid, and analogues thereof.

5. The method according to claim 1, wherein the polynucleotide to be examined is derived from Escherichia coli. 

6. A polynucleotide array, comprising: 

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the nucle- 
otide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize 
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200 con- 
tinuous bases of the first or second polynucleotides, and 
a solid support adhered thereto. 

7. A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having a 
homology of at least 80% with the polynucleotide. 

8. A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or a 
polynucleotide which hybridizes with the polynucleotide under stringent conditions. 

9. A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID 
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions. 

10. A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the nucleotide
sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide sequence 
represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide. 

11. A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of any 
one of claims 7 to 10, or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide 
comprising 10 to 200 continuous bases.

12. A recombinant DNA comprising the polynucleotide of any one of claims 8 to 11.

13. A transformant comprising the polynucleotide of any one of claims 8 to 11 or the recombinant DNA of claim 12.

14. A method for producing a polypeptide, comprising: 






culturing the transformant of claim 13 in a medium to produce and accumulate a polypeptide encoded by the 
polynucleotide of claim 8 or 9 in the medium, and 
recovering the polypeptide from the medium. 

15. A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof, comprising: 

culturing the transformant of claim 13 in a medium to produce and accumulate at least one of an amino acid, 
a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and 
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid,
and analogues thereof from the medium.

16. A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2 to 
3431. 


17. A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931. 

18. The polypeptide according to claim 16 or 17, wherein at least one amino acid is deleted, replaced, inserted or
added, said polypeptides having an activity which is substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition.

19. A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid sequence
of the polypeptide of claim 16 or 17, and having an activity which is substantially the same as that of the polypeptide.

20. An antibody which recognizes the polypeptide of any one of claims 16 to 19.

21. A polypeptide array, comprising: 

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of claims 16 to 19 and
partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

22. A polypeptide array, comprising: 

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the polypeptides
of claims 16 to 19 and partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

23. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryneform
bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1 
to 3501, and target sequence or target structure motif information; 

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501 with the target sequence or target structure motif information, recorded by the data storage device
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 


24. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target sequence
information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with 
the target sequence or target structure motif information; and

(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information. 
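
The storing, comparing, and screening steps of this method can be illustrated with a short sketch. It assumes the stored sequences are plain strings keyed by a record label, reads "coincident" as an exact match, and reads "analogous" as a match within a chosen number of mismatches. That reading, the toy sequences, and the names `find_matches` and `max_mismatches` are illustrative assumptions, not definitions from the claims.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_matches(stored, target, max_mismatches=0):
    """Slide `target` along every stored sequence.

    Returns (record_id, zero_based_offset, mismatches) for each window that
    differs from the target at no more than `max_mismatches` positions.
    """
    hits = []
    n = len(target)
    for record_id, sequence in stored.items():
        for offset in range(len(sequence) - n + 1):
            distance = hamming(sequence[offset:offset + n], target)
            if distance <= max_mismatches:
                hits.append((record_id, offset, distance))
    return hits


# Toy data standing in for stored sequence information.
stored = {"rec1": "ATGGCTAAGTTGACA", "rec2": "ATGTTCACATAA"}
target = "TTGACA"

exact_hits = find_matches(stored, target)                    # coincident
near_hits = find_matches(stored, target, max_mismatches=1)   # analogous
print(exact_hits)
print(near_hits)
```

A production comparator would more likely use an alignment-based search that tolerates insertions and deletions; the sliding window is used here only because it is easy to verify by hand.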

25. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryneform
bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001, and target sequence or target structure motif information; 

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:3502
to 7001 with the target sequence or target structure motif information, recorded by the data storage
device for screening and analyzing amino acid sequence information which is coincident with or analogous to
the target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 


26. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device;

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target sequence or target structure motif information; and

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the
target sequence or target structure motif information.

27. A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having a 
target nucleotide sequence derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide
sequence information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:2
to 3501 with the target nucleotide sequence information for determining a function of a polypeptide encoded
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the polynucleotide
having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and

(iv) an output device that shows a function obtained by the comparator.

28. A method based on a computer for determining a function of a polypeptide encoded by
a polynucleotide having a target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function information
of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with 
the target nucleotide sequence information; and 

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected
from SEQ ID NOS:2 to 3501.

29. A system based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 


(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence information;

(ii) a data storing device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and

(iv) an output device that shows a function obtained by the comparator. 

30. A method based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 


(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function 
information based on the amino acid sequence, and target amino acid sequence information; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target amino acid sequence information; and

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or 
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to 
7001. 
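
One way to picture the comparing and determining steps is to transfer the function annotation of the most similar stored sequence to the target. The sketch below assumes each stored record pairs a sequence with a function string and uses the `SequenceMatcher.ratio()` score from Python's standard `difflib` module as a stand-in for a proper alignment score. The toy sequences, function labels, and the 0.8 cutoff are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Rough similarity in [0, 1]; a stand-in for an alignment-based score."""
    return SequenceMatcher(None, a, b).ratio()


def infer_function(annotated, target, min_similarity=0.8):
    """Return (record_id, function, score) of the best match, or None.

    `annotated` maps record_id -> (sequence, function description).
    None is returned when no stored sequence reaches `min_similarity`.
    """
    best_id, best_score = None, 0.0
    for record_id, (sequence, _function) in annotated.items():
        score = similarity(sequence, target)
        if score > best_score:
            best_id, best_score = record_id, score
    if best_id is None or best_score < min_similarity:
        return None
    return best_id, annotated[best_id][1], best_score


# Toy records; sequences and function labels are invented for illustration.
annotated = {
    "rec_a": ("MKVLAAGIVG", "hypothetical kinase"),
    "rec_b": ("MSTNPLDRWQ", "hypothetical transporter"),
}

result = infer_function(annotated, "MKVLAAGIVA")
print(result)
```

Returning `None` below the cutoff reflects that an annotation should only be transferred when the target is actually coincident with or analogous to a stored sequence.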



31. The system according to any one of claims 23, 25, 27 and 29, wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

32. The method according to any one of claims 24, 26, 28 and 30, wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

33. The system according to claim 31, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

34. The method according to claim 32, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

35. A recording medium or storage device which is readable by a computer in which at least one nucleotide sequence 
information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide sequence is 
recorded, and is usable in the system of claim 23 or 27 or the method of claim 24 or 28. 

36. A recording medium or storage device which is readable by a computer in which at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid sequence 
is recorded, and is usable in the system of claim 25 or 29 or the method of claim 26 or 30. 

37. The recording medium or storage device according to claim 35 or 36, which is a computer readable recording 
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access memory 
(RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM
and DVD-RW.






38. A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the Val
residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform bacterium
is replaced with an amino acid residue other than a Val residue.

39. A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino acid 
sequence as represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue. 

40. The polypeptide according to claim 38 or 39, wherein the Val residue at the 59th position is replaced with an Ala
residue.
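
A residue replacement of this kind can be expressed as a small string operation. The sketch below is illustrative only: the helper `substitute` is hypothetical, positions are counted from 1 as in the claims, and the five-residue sequence is a toy, not the sequence of SEQ ID NO:6952, which is not reproduced here.

```python
def substitute(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position, checking the original residue.

    Raises ValueError if the position is out of range or if the residue found
    there is not the one the caller expects to replace.
    """
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position %d is outside the sequence" % position)
    if sequence[index] != expected:
        raise ValueError(
            "expected %s at position %d, found %s"
            % (expected, position, sequence[index])
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy sequence with Val (V) at position 3, replaced with Ala (A).
toy = "MAVKL"
variant = substitute(toy, 3, "V", "A")
print(variant)
```

Checking the expected residue before replacing it guards against off-by-one errors when a position is taken from a differently numbered sequence.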






41. A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro residue 
at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform bacterium is 
replaced with an amino acid residue other than a Pro residue. 

42. A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino acid 
sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue. 

43. The polypeptide according to claim 41 or 42, wherein the Pro residue at the 458th position is replaced with a Ser 
residue. 

44. The polypeptide according to any one of claims 38 to 43, which is derived from Corynebacterium glutamicum. 

45. A DNA encoding the polypeptide of any one of claims 38 to 44. 

46. A recombinant DNA comprising the DNA of claim 45. 

47. A transformant comprising the recombinant DNA of claim 46. 

48. A transformant comprising in its chromosome the DNA of claim 45. 

49. The transformant according to claim 47 or 48, which is derived from a coryneform bacterium. 

50. The transformant according to claim 49, which is derived from Corynebacterium glutamicum.

51. A method for producing L-lysine, comprising: 

culturing the transformant of any one of claims 47 to 50 in a medium to produce and accumulate L-lysine in 
the medium, and 

recovering the L-lysine from the culture. 

52. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ 
ID NOS:1 to 3431, comprising the following: 

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacterium
which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i); 

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point, or deleting 
the mutation point from a coryneform bacterium having the mutation point; and 

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform 
bacterium obtained in (iii). 
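
Steps (i) and (ii) amount to a position-by-position comparison of a production-strain sequence against the corresponding reference sequence. A minimal sketch follows, assuming the two sequences are already aligned and of equal length, so that only substitutions are reported. Insertions and deletions would need a real alignment step. The 12-base sequences are hypothetical.

```python
def mutation_points(reference, production):
    """List substitutions as (1-based position, reference base, production base).

    Assumes both sequences are pre-aligned and equally long; this sketch does
    not detect insertions or deletions.
    """
    if len(reference) != len(production):
        raise ValueError("substitution-only sketch: sequence lengths must match")
    return [
        (index + 1, ref_base, prod_base)
        for index, (ref_base, prod_base) in enumerate(zip(reference, production))
        if ref_base != prod_base
    ]


# Toy sequences differing at one base (GTT -> GCT in the second codon).
reference = "ATGGTTAAACGT"
production = "ATGGCTAAACGT"

points = mutation_points(reference, production)
print(points)
```

Each reported position is a candidate to be introduced into, or removed from, a strain in step (iii) and then checked for its effect on productivity in step (iv).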

53. The method according to claim 52, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or 
a signal transmission pathway. 

54. The method according to claim 52, wherein the mutation point is a mutation point relating to a useful mutation 
which improves or stabilizes the productivity. 

55. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:1 to 3431, comprising: 

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform bacterium
which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii). 

56. The method according to claim 55, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or 
a signal transmission pathway. 

57. The method according to claim 55, wherein the mutation point is a mutation point which decreases or destabilizes 
the productivity. 

58. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ 
ID NOS:2 to 3431, comprising the following: 

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide sequence
information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity; 

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and 

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform bacterium
which has been transformed with the gene obtained in (iii).

59. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ 
ID NOS:2 to 3431, comprising the following: 

(i) arranging function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a coryneform
bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and 

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either 
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or 
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv). 

60. A coryneform bacterium, bred by the method of any one of claims 52 to 59. 

61. The coryneform bacterium according to claim 60, which is a microorganism belonging to the genus Corynebacte- 
rium, the genus Brevibacterium, or the genus Microbacterium. 

62. The coryneform bacterium according to claim 61, wherein the microorganism belonging to the genus Corynebacterium
is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

63. A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, 
an organic acid and an analogue thereof, comprising: 

culturing a coryneform bacterium of any one of claims 60 to 62 in a medium to produce and accumulate at
least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof; and
recovering the compound from the culture.

64. The method according to claim 63, wherein the compound is L-lysine. 

65. A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the following:

(i) preparing

a protein derived from a bacterium of a production strain of a coryneform bacterium which has been subjected
jected to mutation breeding by a fermentation process so as to produce at least one compound selected 
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and 
a protein derived from a bacterium of a parent strain of the production strain; 

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis; 

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the 
production strain with that derived from the parent strain; 

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase
to extract peptide fragments; 

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and 

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequence represented by SEQ
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.
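
Step (vi) can be pictured as a lookup of the sequenced peptide fragments against the stored protein sequences. The sketch below assumes the fragments are exact substrings of the parent protein and that the database is a mapping from a record label to an amino acid sequence. Both the toy database and the helper name `identify_proteins` are hypothetical.

```python
def identify_proteins(peptides, database):
    """Return labels of proteins whose sequence contains every peptide fragment.

    Matching is exact substring matching; sequencing errors and ambiguous
    residues are not handled in this sketch.
    """
    if not peptides:
        return []  # nothing to match on
    return [
        record_id
        for record_id, sequence in database.items()
        if all(peptide in sequence for peptide in peptides)
    ]


# Toy database; sequences are invented for illustration.
database = {
    "protein_1": "MSKEKFERTKPHVNVGTIGHVDHGK",
    "protein_2": "MAVKLTTGSRPEWQNDL",
}
fragments = ["FERTKPHV", "GTIGHVD"]

matches = identify_proteins(fragments, database)
print(matches)
```

Requiring every fragment to match narrows the result to a single candidate more often than matching any one fragment, at the cost of missing proteins when one fragment was sequenced incorrectly.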

66. The method according to claim 65, wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

67. The method according to claim 66, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

68. A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).

FIG. 4